This paper describes the calculation of smoothing and prediction operators
of the linear least-squares sort using techniques derived from a circuit theory
point of view. The techniques are developed explicitly for time series which
are continuous and statistically stationary. Other situations are explored
more briefly, however, in which the time series are either discrete or
statistically nonstationary.

For the most part, functions of time are replaced by functions of frequency,
representing their transforms. Mathematical complications are avoided by
restricting statistical ensembles to those which have rational power spectra.
In practice, actual spectra can be approximated sufficiently well by rational
spectra, and the simplified methods are sufficiently general for engineering
applications of many different sorts. Both finite and semi-infinite smoothing
intervals are permitted, with or without constraints of various sorts. The
assumption of rational spectra does not apply directly to nonstationary time
series, but it may be replaced by a closely analogous restriction which does
apply. Then there are nonstationary operations which are closely analogous
to the stationary operations, developed for stationary systems. A brief
examination of these analogies is of interest, even though the nonstationary
operations are usually too complicated for engineering purposes.

The general techniques are developed in terms of specific problems, chosen 
for purposes of exposition and because of their engineering interest. 
I. INTRODUCTION

For a number of years, beginning roughly at the end of World War II,
there has been a growing interest in theories of optimum smoothing and
prediction. Much of the work has been concerned with optimum smoothing
and prediction of the linear least-squares sort applied to statistically
stationary time series, a subject which is both attractive to
mathematicians and important in various engineering problems.

This paper describes techniques for solving smoothing and prediction
problems of the stationary, linear, least-squares sort using a circuit
theory point of view. It avoids the more difficult mathematics of the
very general, completely rigorous treatments, but maintains sufficient
generality for many engineering applications. It develops general
techniques in terms of specific engineering problems, which are of real
interest in themselves and may also serve as patterns to be followed in
solving other problems. Among the problems considered explicitly are the
following: classical smoothing and prediction problems solved by
Wiener,¹ Kolmogoroff,² Zadeh and Ragazzini,³ etc.; the simultaneous use
of different instruments, with different error spectra, for the observation
of single physical variables; applications of the mathematics of data
smoothing to circuit design problems which do not actually involve data
smoothing as such.

The general techniques described here have been developed over a
period of years. Some of the results have already been stated, in
specialized reports describing specific applications of one sort or another.⁴,⁵,⁶

1.1 Further Background

Present-day theories of smoothing and prediction may be said to
have started with the classic papers of Wiener¹ and Kolmogoroff,² which
were written during World War II. They assumed linear least-squares
operations, stationary statistics and observations available for all past
times. Zadeh and Ragazzini³ modified the theory for observations which
are available only over a past interval of finite duration. By now, many
other papers have been published, aimed at generalizing, modifying,
interpreting or applying the original theories. A complete bibliography
would be very extensive, and will not be attempted in this paper.*

Differences in points of view can result in quite different formulations
of smoothing and prediction theory, even though the formulations must
reflect the same mathematical fundamentals. This is important, because
the classic papers of Wiener, Kolmogoroff, Zadeh and Ragazzini, etc.
contain quite formidable mathematics, which is not generally accessible
to engineers. Much of this mathematics can be avoided by imposing
certain additional restrictions, which are generally minor in terms of
resulting restrictions on engineering applications. Further complications
can be avoided by not requiring perfect rigor in regard to all singular,
or mathematically "pathological" situations. This point of view is quite
different from that adopted, for example, by Doob⁸ in his very general
treatment of smoothing and prediction in terms of the general theory
of stochastic processes.

Bode and Shannon⁹ simplified the derivation of Wiener and
Kolmogoroff's most important result, using circuit theory concepts to
interpret the mathematical operations in physical terms. Their physical
interpretations are very powerful tools for engineers who must solve the
mathematical problems and, in fact, their paper is our principal reference
in what follows. Their method of solving the Wiener-Kolmogoroff
problem, however, does not apply to the Zadeh-Ragazzini problem or to
various other generalizations of engineering interest (unless it is
complicated in ways which destroy most of its advantages). Furthermore, their
solution of the Wiener-Kolmogoroff problem itself is not simple in
numerical applications.

This paper uses a circuit theory point of view in a somewhat different
way, which leads to more general applications and to simpler
computations. The advantages are obtained at the price of an additional
mathematical restriction. Functions of frequency representing statistical
"power spectra" are required to be rational, whereas more general
theories allow more general functional forms.† This is a minor restriction in
most engineering applications, where nonrational functions can be
replaced by rational approximations.

* For an extensive bibliography (as of 1955) see Stumper.⁷

† Bode and Shannon mention rational spectra as simplifying one part of their
method (the evaluation of a loss-phase integral), but they do not seek other
simplifications, which may be realized by a rather different method of solution.

Under the assumption of rationality, most of the analysis can be
carried out in the frequency domain, in terms of the more usual operations
of circuit theory. The concepts of generalized Borel fields, measurable
spaces and even Hilbert spaces need not be used at all. Usually,
Wiener-Hopf equations can be replaced by contour integrals in the complex
plane. When Wiener-Hopf equations do appear, they may be replaced
quickly by conditions applied to the analytic properties of functions of
frequency. This avoids the usual difficulties with "δ functions" and their
derivatives, and states conditions in forms more familiar to circuit
theorists. Frequently, end results may be expressed as conditions which
determine network zeros and poles more or less directly. These
circumstances all depend, however, on the basic assumptions of linear
least-squares smoothing. For the simple methods, the assumption of stationary
statistics also is essential; more complicated analogous methods apply
to time-variable situations, at least in principle. Nonstationary systems
are discussed in this paper only briefly, in Section 3.6.

For the most part, continuous-data systems are assumed. However,
the techniques developed for continuous data can readily be adapted to
sampled-data problems, by methods which are outlined in Section 3.5.
These methods have not yet been compared in detail with more direct
methods of handling sampled-data problems such as, for example, those
of Levinson (Ref. 1, Appendix B), and Lloyd and McMillan.¹⁰

Chang¹¹ has described a frequency-domain equivalent of Wiener and
Kolmogoroff's central result. He starts with contour integration, but does
not simplify the solution by assuming rational spectra. He does not
extend the method to the finite memory problem solved by Zadeh and
Ragazzini. Zadeh and Ragazzini themselves describe a solution which
assumes rational spectra, but they use a time-domain analysis which is
less simple than an analysis in the frequency domain. Laning and Batten¹²
also describe smoothing and prediction in time-domain terms, subject
to the assumption of rational spectra.

In Ref. 11 Chang also points out that the mathematics of smoothing
and prediction may be applied to network synthesis problems which do
not actually involve data smoothing or prediction. Basically, Chang
proposes designing to a least-squares error criterion, where the error is
the magnitude of the complex difference between a physical transfer
function and a nonphysical ideal function which it is to approximate.
This appears to be a quite rewarding approach to various nonstochastic
problems in network synthesis, particularly after the frequency-domain
method has been extended to more general smoothing and
prediction problems.

Within the general field of data smoothing, an important variation of
the classical problem is as follows: In the classical problem, one is
concerned with the optimum smoothing and prediction of a statistical signal,
contaminated with a statistical noise, when the statistics of the signal
and noise are known. In the variation, one is concerned with the
simultaneous use of two different instruments to measure a single physical
variable or signal. In the simplest form, the readings of the two
instruments are combined through optimum linear operations, subject to the
condition that the net error is to depend only on the instrumental error.
Then the signal statistics do not enter at all, but two different statistics
must still be considered, corresponding to the two different instrumental
errors. If the errors of the two instruments have quite different frequency
characteristics, the two-instrument combination can give much greater
accuracy than either instrument alone. This may be compared with the
use of "woofer" and "tweeter" speakers in high fidelity sound systems.
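
The two-instrument idea can be sketched numerically. The following is only a rough illustration and not the optimum design developed in the paper: it assumes a discrete-time first-order low-pass filter with an arbitrary coefficient `a`, and it assumes that the two operators are chosen to sum to unity, which is one way of making the net error depend only on the instrumental errors. The signal, the two error waveforms and all names are hypothetical.

```python
import numpy as np

def low_pass(x, a=0.9):
    """First-order recursive low-pass filter: y[k] = a*y[k-1] + (1 - a)*x[k]."""
    y = np.zeros(len(x))
    state = 0.0
    for k, value in enumerate(x):
        state = a * state + (1.0 - a) * value
        y[k] = state
    return y

def combine(f1, f2, a=0.9):
    """Low-pass instrument 1 and high-pass instrument 2; the two operators sum to unity."""
    return low_pass(f1, a) + (f2 - low_pass(f2, a))

k = np.arange(400)
s = np.sin(2.0 * np.pi * k / 50.0)      # assumed "true" signal
n1 = np.where(k % 2 == 0, 1.0, -1.0)    # instrument 1: rapidly alternating error
n2 = np.ones(len(k))                    # instrument 2: constant bias error

g = combine(s + n1, s + n2)
# Largest error of the combination after the start-up transient has died out.
tail_error = np.max(np.abs(g[-100:] - s[-100:]))
```

Each instrument alone is in error by a full unit in this example, while the combination leaves only the small part of the alternating error that survives the low-pass filter.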

The techniques described here were developed, to a considerable
extent, in connection with specific applications of the two-instrument
problem described in Refs. 4, 5 and 6. The two-instrument optimization
problem was suggested by previous uses of two kinds of instruments,
combined through arbitrary, non-optimum linear operations.* More
recently, two-instrument optimization principles have been described
in papers by Bendat,¹⁴ and Stewart and Parks.¹⁵

1.2 Organization of the Paper 

The remainder of this paper is organized as follows: Section II formu- 
lates a fairly general smoothing and prediction problem in mathematical 
and physical terms. At the same time it reviews certain mathematical 
relations which will be needed in the sequel, including some elementary 
Fourier transforms, some properties of "physical" networks and some 
properties of stationary Gaussian noise. 

Section III develops techniques for solving the general problem. The
techniques are explained in terms of specific problems, which are special
cases of the general problem (or reasonable variations of it) and are
especially suitable for purposes of explanation. Section IV describes
other specific problems and engineering applications which have been
chosen primarily for their practical interest.

Some of the specific problems illustrate existing engineering
applications. Others are merely potentially useful or of interest for largely
theoretical reasons. Some of the problems may not have been solved before.
Others have well-known solutions, in one form or another, and are
included purely to illustrate the generality and efficiency of the
techniques under discussion.

* Examples are: an instrument made by North American Aviation for the
Sandia Corporation (which furnished a starting point for Ref. 4) and a proposal
of Crooks.¹³

A more detailed outline of the paper may be seen in the table of con- 
tents at the beginning of this paper. 

II. FORMULATION OF GENERAL RELATIONS 

In this section we formulate a central problem, in about the same way 
as Bode and Shannon, and review some mathematics which will be needed 
in the sequel. In later sections, we shall modify some of the details, but 
within a set of fundamental restrictions which are included in the for- 
mulation described below. 

2.1 The Central Problem

The central problem is as follows: We are given a time function f(t),
representing a signal s(t) contaminated by noise n(t):

$$f(t) = s(t) + n(t). \tag{1}$$

The time functions s(t) and n(t) are drawn from statistical ensembles of
such functions, and we assume that the pertinent statistical
characteristics of the ensembles are known.

We are to derive from f(t) an estimate g(t) of s(t + α). When α is
positive, g(t) is a prediction of what the true signal s will be α seconds
from present time t. When α is negative, α = −β and g(t) is an estimate
of what the true signal was β seconds before present time t. The
operations to be used in deriving g(t) from f(t) are restricted in various ways.
Then g(t) generally will not match s(t + α) exactly, but will be in error,
by an amount ε(t):

$$g(t) = s(t + \alpha) + \epsilon(t). \tag{2}$$

The permitted operations are to be used in such a way that g(t) is an
optimum estimate of s(t + α), as judged by a specific criterion applied
to the statistics of the error ε(t). The problem is to find the specific
combination of operations within the permitted operations which will, in
fact, yield the optimum g(t).

An engineering representation of the problem is illustrated in Fig. 1.
The "observed" signal f(t) differs from the "true" signal s(t) by the noise
n(t). The observed signal is to be modified by passing it through some
sort of device, such as an electrical network, to obtain the output signal
g(t), which is to represent an estimate of s(t + α). The action of the
device is described by a mathematical operator, which is simply a
symbolic representation of the corresponding mathematical process
relating g(t) to f(t). If the device must be chosen from some class of
permitted devices, this class of permitted devices will determine a class of
permitted operators. The problem is to determine the optimum operator
within the permitted class, as a first step in designing an optimum device.

Fig. 1 - A physical representation of the smoothing and prediction problem.

2.2 General Assumptions 

The specific method of analysis depends on specific assumptions
regarding the statistics of the signal and noise, the criterion used to define
optimum estimates and the class of permitted operators. The six
conditions stated below will be assumed throughout this paper, except in a
few instances where certain specific departures will be noted. Other
conditions will vary with different circumstances, considered in different
sections, and will be noted when needed.

i.* The signal and noise statistics are assumed to be stationary. Thus,
statistical characteristics which refer to a single time are the same for all
times, correlations involving more than one time depend only on time
differences, etc.

ii. The criterion used to define an optimum estimate is to be the
average square error, or variance σ² = ave ε². In other words, the permitted
operations are to be used in such a way that σ² is a minimum.

iii. The permitted operations are to be linear. A linear operation,
applied to f(t), may yield a sum of terms of the following sorts: values of
f(t) at specific times, derivatives of f(t) of any order, weighted integrals
of f(t).

iv.† The "power spectra" of the signal and noise are to be rational
functions of frequency ω, real when ω is real. (Power spectra are
Fourier transforms of covariance functions.)

v. The statistics and the permitted operations must be such that
estimates with bounded average square errors are possible.

* Except in Section 3.6, which shows how the methods used for stationary
statistics may be applied to time-variable systems of certain very special kinds.
† Except for possible departures in Sections 3.2.3 and 4.3.

vi. In general, the permitted operations will be an assigned subset of
the class of all linear operations (for example, the class of all "physical"
linear operations). Sums and differences of permitted operations will
always be linear operations, but they will not always be permitted linear
operations. However, we will assume that the class C_Y of permitted
operations Y always has the following property: If Y_1 and Y_2 are
permitted operations and k is an arbitrary positive or negative real
constant, there must always exist a permitted operation Y_3 such that*

$$Y_3 - Y_1 = k(Y_2 - Y_1). \tag{3}$$

Conditions i, ii and iii are fundamental to the theories of Wiener,¹
Kolmogoroff,² Zadeh and Ragazzini,³ and Bode and Shannon.⁹ It is these
conditions which make the mathematics tractable. Conditions i and iii
are clearly appropriate for a treatment using conventional theories of
fixed linear circuits. Under condition ii, the optimization depends only
on linear correlations, as will be confirmed in Section 2.6. Then the
actual statistics may be replaced by any more convenient statistical models
which have the same linear correlations. Bode and Shannon discuss ways
in which these three assumptions do and do not limit engineering
applications. The limits should be clearly understood before practical
applications are attempted.

Under conditions i, ii and iii, with no further restrictions,
mathematically "pathological" situations must be accounted for, and these lead
to quite formidable (although tractable) mathematics. Condition iv,
assumed here, excludes the more pathological situations. The resulting
simplifications in the necessary mathematics are very substantial. While
the requirement of rational spectra is an arbitrary restriction, it does not
restrict engineering applications to a serious extent. The nonrational
spectra usually encountered can be approximated sufficiently well with
rational functions.

Condition v is not a significant restriction on the usefulness of a design 
method. The convergence of certain integrals in which we will be inter- 
ested depends on this condition, and it is stated here for ready reference. 
Note that v does not require convergence of the integrals of the signal 
and noise spectra alone, provided the permitted operations can lead to 
estimates with bounded average square errors. (See, for example, Sec- 
tion 4.1.1.) 

Condition vi guarantees that the optimum permitted operator will
correspond to a "stationary point" in the usual calculus of variations
sense. It can frequently be simplified to the following: If Y_1 is a permitted
operator, then kY_1 is a permitted operator. We are going to examine
certain constraints, however, which exclude the simple form. The typical
constraint of this sort permits only linear operations Y which make no
change in some particular (specified) time function (for example, a
constant, or dc, signal). If f_0(t) is the particular time function, and
Y·f_0(t) is the result of applying operator Y to f_0(t), the constraint
requires Y·f_0(t) = f_0(t). But then (kY)·f_0(t) becomes kf_0(t), and is not
permitted. On the other hand, (Y_2 − Y_1)·f_0(t) becomes [f_0(t) − f_0(t)] = 0,
and the same is true of k(Y_2 − Y_1).

* L. A. MacColl has shown that condition vi is equivalent to the following:
Let Y_0 be any one permitted operation, and let V be the class of operations
Y − Y_0. Then V is a "linear subspace" and C_Y a "flat subspace" in the linear
vector space of all linear operations.
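
The constraint argument above can be checked with a small numerical sketch. It is an illustration only, under assumptions that are not in the paper: the operators are taken to be discrete weighted sums of the present and two past samples, the protected time function is a unit constant (dc) input, and the weights and the value of `k` are hypothetical.

```python
import numpy as np

def dc_response(weights):
    """Response of a weighted-sum operator to a unit constant input (the sum of its weights)."""
    return float(np.sum(weights))

# Two hypothetical permitted operators: each leaves a constant input unchanged.
y1 = np.array([0.5, 0.5, 0.0])
y2 = np.array([0.25, 0.5, 0.25])
k = -3.0

scaled = k * y1            # k*Y1 turns the constant 1 into k, so it is not permitted
y3 = y1 + k * (y2 - y1)    # Y3 defined by Y3 - Y1 = k*(Y2 - Y1), as in (3)

scaled_dc = dc_response(scaled)
y3_dc = dc_response(y3)
```

Here `y3_dc` stays at unity for any choice of `k`, because the difference `y2 - y1` has zero response to the constant input, while `scaled_dc` equals `k`.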

2.3 Substitution of Gaussian Ensembles

We now replace the actual signal and noise ensembles by Gaussian
ensembles with the same linear correlations (as permitted under assumption
ii). For a more specific physical representation, we may think of the new
f(t) and g(t) as electrical signals (voltages or currents), provided the
pertinent statistical characteristics are retained. Stationary Gaussian
ensembles may be generated by passing white noise through (idealized)
linear networks. Under assumption iii, the operations used to derive
g(t) from f(t) also correspond to some linear network. Then, Fig. 1 may
be replaced by Fig. 2. The two white noise ensembles are uncorrelated
(assuming that signal and noise are uncorrelated). Their spectral
densities are unity (scale factors appear as gains or losses associated with the
networks, which are permitted to include amplifiers). The linear
operations performed by the networks may be represented by frequency
functions Y_S(p), Y_N(p), Y_G(p), where p = iω. Responses to any time
functions may be found from the frequency functions by means of Fourier
transforms. We shall say more about these later on, and also about the
properties of white noise.



[Figure: white noise source No. 1 drives the network Y_S(p) to produce s(t);
white noise source No. 2 drives Y_N(p) to produce n(t); their sum, the observed
f(t), drives Y_G(p) to produce g(t).]

Fig. 2 - A Gaussian physical model.
[Figure: a single white noise source drives the network Y_F(p) to produce f(t).]

Fig. 3 - An alternative physical model for f(t).

The noise sources themselves are not available to the observer, who
sees only f(t). The noise sources and associated networks corresponding
to Y_S(p) and Y_N(p) are merely imagined devices which permit the
pertinent signal and noise statistics to be described in physical terms. Our
problem is to find that particular permitted Y_G(p) (regarded as a linear
operator) which converts f(t) into the optimum g(t).

In Fig. 2, f(t) is the sum of two Gaussian ensembles. Since the sum of
two Gaussian ensembles is a Gaussian ensemble, f(t) may be represented
more simply, as in Fig. 3. This representation does not show the
correlation between f(t) and s(t) or n(t), but it will be useful in some of our
analysis. The power spectrum of f(t), viewed as a single ensemble, is the
sum of the power spectra of the (uncorrelated) signal and noise.
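
As a small worked illustration of this additivity (the particular spectra are assumed purely for the example and are not taken from the paper), consider a signal spectrum that falls off with frequency and a white noise of unit level:

```latex
\begin{aligned}
S &= \frac{1}{1+\omega^{2}}, \qquad N = 1, \\
F &= S + N = \frac{2+\omega^{2}}{1+\omega^{2}}
   = \frac{2 - p^{2}}{1 - p^{2}} \qquad (p = i\omega).
\end{aligned}
```

Both assumed spectra are rational, so their sum is rational as well, in keeping with condition iv.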

The auto-covariance of any one of the ensembles (signal, noise or
signal plus noise) may be specified in any one of three ways: directly as a
function of time difference [average of s(t)s(t + τ), etc.]; by means of the
power spectrum (which is the Fourier transform of the auto-covariance
function); or by means of a network function, Y_S, Y_N or Y_F of Figs. 2
or 3 (from which the power spectrum can easily be computed). We will
use the following notation:



Ensemble               Auto-covariance Function   Power Spectrum   Network Function

s(t)                   Φ_S(τ)                     S                Y_S(p)
n(t)                   Φ_N(τ)                     N                Y_N(p)
f(t) = s(t) + n(t)     Φ_F(τ)                     F = S + N        Y_F(p)



Let E and Y_E be any of the three spectra and its corresponding
network function. Let Ȳ(p) designate Y(−p):

$$\bar{Y}(p) = Y(-p). \tag{4}$$

Then an elementary property of power spectra requires:*

$$E = Y_E(p)\,\bar{Y}_E(p). \tag{5}$$

We need to know how to find Y_E when E is given. In general, the
relation of Y_E to E involves the general loss-phase integrals, as described by
Bode.¹⁶ Under our assumption of rationality, however, the loss-phase
integrals can be replaced by simple relations between zeros and poles
(also in accordance with Bode¹⁶).

* Related to equations (T-9) and (T-14) of Table I, Section 2.5, and the
properties of white noise.

Equation (5) makes E an even function of p, and also an even function
of ω (since p² = −ω²). At real frequencies, E = |Y_E(iω)|² and is
non-negative. The zeros of E occur in positive and negative pairs, like +p_σ
and −p_σ, and so do the poles. Of each pair, one is a zero or pole of Y_E
and the other a zero or pole of Ȳ_E. When E is given, the zeros and poles
can be arranged in + and − pairs, and one of each pair can be assigned
to Y_E. The possible assignments are not unique, however, unless some
further restriction is imposed. For our purposes, we will need the specific
assignment described below.



[Figure: the complex p plane, showing the lhp, the rhp, the real p axis and a
point −p_σ.]

Fig. 4 - The complex plane for p = iω.

Referring to the complex plane for p oriented as in Fig. 4, if p_σ lies on
one side of the real ω axis, −p_σ lies on the other side. Using "lhp" to
designate "left-half plane", we require:

The zeros and poles of Y_E are the lhp zeros and poles of E,
where E = S, N, F.                                                    (6)

There may also be zeros of E on the real ω axis, but these always occur
in identical pairs, one of which goes to Y_E.*
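
The assignment rule (6) can be sketched in a few lines of code. This is a minimal illustration rather than the paper's procedure: it assumes that E is supplied as a ratio of even polynomials in p, that E has no zeros or poles on the real frequency axis (so the identical-pair case is not handled), and that the constant factor may be fixed by matching E at one real frequency `w0`. The function name and the example spectrum are hypothetical.

```python
import numpy as np

def spectral_factor(num, den, w0=1.0):
    """Return (y_num, y_den), polynomial coefficients of Y_E(p), highest power first.

    num, den: coefficients of E as a ratio of even polynomials in p.
    Assumes no zeros or poles of E on the real frequency axis (Re p = 0)
    and that w0 is a real frequency at which E is positive.
    """
    zeros = np.roots(num)
    poles = np.roots(den)
    # Keep the left-half-plane member of each +/- pair, as in rule (6).
    y_num = np.atleast_1d(np.poly(zeros[zeros.real < 0]))
    y_den = np.atleast_1d(np.poly(poles[poles.real < 0]))
    # Fix the constant factor so that |Y_E(i*w0)|^2 equals E at w0.
    p0 = 1j * w0
    e0 = (np.polyval(num, p0) / np.polyval(den, p0)).real
    y0 = np.polyval(y_num, p0) / np.polyval(y_den, p0)
    gain = np.sqrt(e0) / abs(y0)
    return gain * y_num, y_den

# Assumed example: E = (2 - p^2)/(1 - p^2), which is (2 + w^2)/(1 + w^2) at p = i*w.
y_num, y_den = spectral_factor([-1.0, 0.0, 2.0], [-1.0, 0.0, 1.0])

def Y(p):
    """Evaluate the factor Y_E at a (possibly complex) frequency p."""
    return np.polyval(y_num, p) / np.polyval(y_den, p)
```

For this example the zeros of E are at p = ±√2 and the poles at p = ±1, so the sketch assigns the zero at −√2 and the pole at −1 to Y_E.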

2.4 Properties of Physical Frequency Functions 

Let "rrhp" designate "regular in (the finite part of) the right-half
plane." Then an immediate consequence of (6) is

Y_E and 1/Y_E are rrhp where E = S, N, F.                             (7)

* As an immediate consequence of the nonnegative character of E = |Y(iω)|²
at real frequencies.

Then both Y_E and 1/Y_E are "physically realizable" in the general sense
of Bode.* Under assumption iv they are realizable with finite networks
of lumped elements, provided these are idealized to include multiple
poles at p = ∞ (which can be approximated within reasonable limits by
active circuits).

The function Y_G(p) in Fig. 2, which converts the observed f(t) into
the estimate g(t), may not be rational even though Y_S and Y_N are
rational. When a nonrational Y_G is required to be "physical", it must still
be regular in the finite part of the rhp. Now, however, it may have an
essential singularity at p = ∞, and this must meet an additional
restriction.

A general definition of a "physical" frequency function, Y(p), merely
requires that it be "causal". A causal Y(p), applied as an operator to
any time function f(t), produces a response g(t) which depends only on
present and past values of f(t). Causal frequency functions have been
studied in very general terms, by Wiener,¹ Beurling,¹⁷ Nyman¹⁸ and
Youla, Castriota, and Carlin,¹⁹ but the general mathematics is relatively
complicated. A less comprehensive description of the conditions for
physicalness will be sufficient for our purposes. Let the definition of rrhp be
extended, arbitrarily, to

rrhp means:

  a. Regular in the finite part of the rhp,
  b. Approaches $\sum_{m,\gamma} C_{m,\gamma}\, p^{m} e^{-\gamma p}$ as $p \to \infty$,          (8)
     where $m$ = integer, $\gamma$ = real and $\geq 0$, $C_{m,\gamma}$ = real.

Then, for purposes of this paper, it is sufficient to define "physical" by: 

A physical Y(p) is any real function of p which is rrhp. (9) 

(A real function of p is one which is real when p is real.) 

In (8), the second condition permits Y(p) to behave like p^m e^{−γp} at
p → ∞, provided γ is positive. Behaviors of this kind can be obtained
with networks of lumped elements and ideal delay lines.

We will need corresponding relations describing Ȳ(p) = Y(−p). Let
rlhp be defined by

rlhp means:

  a. Regular in the finite part of the lhp,
  b. Approaches $\sum_{m,\gamma} C_{m,\gamma}\, p^{m} e^{+\gamma p}$ as $p \to \infty$,          (10)
     where $m$ = integer, $\gamma$ = real and $\geq 0$, $C_{m,\gamma}$ = real.

Then, it follows from (8) and (9) that:

A physical Ȳ(p) is any real function of p which is rlhp.              (11)

* As used here "physically realizable" includes the requirement of stability.
This is the usual interpretation of the conventional theory of linear networks.
An unstable linear device can in fact exist and it can be driven by an input which
is a general function of time, but only over a finite time interval.

2.4.1 An Essential Integral Theorem 

Bode¹⁶ has derived a number of special properties of "physical"
functions, in terms of integrals in the complex p plane. These include the
loss-phase relations, the integral in the definition of resistance efficiency and
other similar integrals. One particularly simple theorem of this sort is
essential to our method of solving smoothing and prediction problems.

For our purposes, the theorem may be stated as follows:

If: a. H(p) is either rrhp or rlhp, and
    b. $|H(i\omega)| = O\,\omega^{-2}$ when $\omega$ is real and $\to \infty$,*          (12)

then

$$\int_{-\infty}^{+\infty} H(i\omega)\, d\omega = 0.$$

The function H(p) is not necessarily one of our network functions,
Y_E(p) or Y_G(p), provided it is rrhp or rlhp in the sense of (8) or (10)
and also meets the convergence condition. Generally, it will be a
combination of our network and spectral functions.

The theorem is easily proved by closing the path of integration with
an arc at ∞. The arc encloses either the rhp or the lhp, as in Fig. 5(a)
or 5(b), depending upon whether H(p) is rrhp or rlhp. Because of (12a)
the integration around the closed contour is 0. Because of (12b), the
integral over the arc at ∞ is 0 [provided γ ≥ 0 in (8) or (10), as required].
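
The theorem lends itself to a quick numerical check. The sketch below is an illustration under assumptions of its own: the two test functions are hypothetical examples chosen here, and the infinite range is handled by the substitution ω = tan θ with a midpoint rule, which is merely a convenient numerical device and not part of the paper's argument.

```python
import numpy as np

def integrate_over_real_frequencies(H, n=200_000):
    """Approximate the integral of H(i*w) over all real w by the midpoint rule,
    after the substitution w = tan(theta), -pi/2 < theta < pi/2."""
    h = np.pi / n
    theta = -np.pi / 2 + (np.arange(n) + 0.5) * h
    w = np.tan(theta)
    return np.sum(H(1j * w) / np.cos(theta) ** 2) * h

def rrhp_example(p):
    """Only pole at p = -1 (in the lhp), and |H(iw)| falls off like w**-2."""
    return 1.0 / (p + 1.0) ** 2

def mixed_example(p):
    """Poles at p = -1 and p = +1, so H is neither rrhp nor rlhp."""
    return 1.0 / ((p + 1.0) * (1.0 - p))

zero_case = integrate_over_real_frequencies(rrhp_example)
nonzero_case = integrate_over_real_frequencies(mixed_example)
```

The first function satisfies both conditions of (12) and its integral vanishes to within rounding error. The second has a pole in each half plane; at real frequencies it equals 1/(1 + ω²), whose integral is π rather than zero.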

2.5 Fourier Transforms 

Equations (1) and (2) describe our problem in terms of time functions,
while Fig. 2 describes it in terms of frequency functions (and the
properties of white noise). In general, a facility at transforming quickly between
time-domain and frequency-domain formulations is an important tool in
smoothing and prediction problems. Time-domain and frequency-domain
formulations are, of course, mathematically equivalent, and are
related by Fourier transformations. A few elementary theorems
regarding Fourier transforms which will be referred to in later sections are
reviewed in Table I.

* The symbol O has the following meaning: if q(ω) = O r(ω), then q(ω)/r(ω) is
bounded; thus, H = O ω⁻² means ω²H is bounded.

In order for the Fourier transforms to exist, certain mild conditions 
must be met. We will assume, a priori, that the conditions are satisfied 
wherever we use the transforms. This is a departure from strict rigor, but 
our use of the transforms will be entirely reasonable, under our assump- 
tions iv and v. 

When Y(p) is the frequency function of a network or equivalent
device, Y(iω) indicates the steady state response to a sinusoidal input. The
inverse transform, K(t), is the response to an ideal unit impulse, or δ
function, applied at time t = 0. A general input time function, f(t), may
be thought of as a series of impulses. The effect of any one impulse on the
response g(t) at a given time t depends on the amplitude of the impulse
and on the length of time which has elapsed since the impulse was
applied. Then

$$g(t) = \int_{-\infty}^{+\infty} f(t - \tau)\, K(\tau)\, d\tau = f(t) * K(t). \tag{13}$$

Thus, g(t) is a weighted integral of f(t), in which the weight factor is
K(τ) and τ represents the "age of data".
The response of a physical device cannot depend on future inputs.





Fig. 5 — Arcs at infinity, enclosing half-planes.

Table I — Fourier Transforms

Definitions

Fourier Transform:   $Y(i\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} K(t)\, e^{-i\omega t}\, dt$   (T-1)

Inverse Transform:   $K(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} Y(i\omega)\, e^{+i\omega t}\, d\omega$   (T-2)

Convolution:   $K_1(t) * K_2(t) = \int_{-\infty}^{+\infty} K_1(t - \tau)\, K_2(\tau)\, d\tau$   (T-3)

Delta Function:   $\delta(t) = 0$ when $t \neq 0$, $\int_{-\infty}^{+\infty} \delta(t)\, dt = 1$   (T-4)

Some Fourier Transform Pairs

$K(-t)$                    $Y(-p) = \bar{Y}(p)$          (T-5)
$kK(t)$                    $kY(p)$                       (T-6)
$K_1(t) + K_2(t)$          $Y_1(p) + Y_2(p)$             (T-7)
$\delta(t)$                $1/\sqrt{2\pi}$               (T-8)
$K(t - \tau)$              $Y(p)\, e^{-p\tau}$           (T-9)
$K_1(t) * K_2(t)$          $Y_1(p)\, Y_2(p)$             (T-10)
$K(t) * K(-t)$             $Y(p)\, \bar{Y}(p)$           (T-11)
$K(t) = 0,\ t < 0$         $Y(p)$ rrhp*                  (T-12)
$K(t) = 0,\ t > 0$         $Y(p)$ rlhp*                  (T-13)

Some Related Formulas

Parseval's Equation:

$$\int_{-\infty}^{+\infty} K_1(t)\, K_2(t)\, dt = \int_{-\infty}^{+\infty} \bar{Y}_1(i\omega)\, Y_2(i\omega)\, d\omega \tag{T-14}$$

If $Y(p)$ = real when $p$ is real:

$$\int_{-\infty}^{+\infty} [K(t)]^2\, dt = \int_{-\infty}^{+\infty} |Y(i\omega)|^2\, d\omega \tag{T-15}$$

Power Spectrum = Fourier Transform of Covariance.   (T-16)

If $g(t)$ is the response of a linear device to an input $f(t)$, and if
$g(t) = Y(p)e^{pt}$ when $f(t) = e^{pt}$ and $g(t) = K(t)$ when
$f(t) = \delta(t)$, then $Y(p)$ and $K(t)$ are a Fourier transform pair and,
for a general $f(t)$,

$$g(t) = f(t) * K(t) = K(t) * f(t). \tag{T-17}$$
* Provided certain pathological $Y(p)$ and $K(t)$ are excluded, in a way
consistent with our definitions of "rrhp," "rlhp," and "physical."

Thus, (13) requires:

If $K(t)$ is physical, $K(t) = 0$ when $t < 0$. (14)

When $t \geq 0$, $K(t)$ need not be particularly well behaved, for it can
include $\delta$ functions and their derivatives. If it contains nothing worse
than derivatives of $\delta$ functions it can be approximated with combinations
of transversal filters and differentiators.
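
A discrete-time sketch may make (13) and (14) concrete. It approximates the
convolution integral by a Riemann sum. The exponential weighting function,
the sinusoidal input, the step size, and the function name `respond` are
illustrative assumptions, not taken from the text.

```python
import math

def respond(f, k, dt):
    """Riemann-sum version of g(t) = integral of f(t - tau) K(tau) d tau."""
    g = []
    for n in range(len(f)):
        total = 0.0
        # K(tau) = 0 for tau < 0, so only present and past inputs f[n - m] enter.
        for m in range(min(n + 1, len(k))):
            total += f[n - m] * k[m]
        g.append(total * dt)
    return g

dt = 0.01                                        # sample spacing (assumed)
k = [math.exp(-m * dt) for m in range(500)]      # K(tau) = exp(-tau), tau >= 0
f1 = [math.sin(0.5 * n * dt) for n in range(400)]
f2 = list(f1)
for n in range(200, 400):
    f2[n] += 1.0                                 # alter only the later inputs

g1 = respond(f1, k, dt)
g2 = respond(f2, k, dt)
print(g1[:200] == g2[:200])                      # earlier outputs are unaffected
```

Because the weighting function vanishes for negative age of data, altering the
input from sample 200 onward leaves every earlier output sample unchanged.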

2.6 A Variational Condition, Equivalent to the Optimization Requirement. 

In Figs. 2 and 3 our inputs are drawn from unit-level white noise
ensembles. White noise may be described in either frequency-domain or
time-domain terms, in accordance with Rice. In the frequency domain
it is a sum of sinusoids of all frequencies, with phases that are completely
random. The "spectral density" is constant over all frequencies. If the
white noise is applied to a network described by $Y(p)$ the corresponding
output has similar properties, except that the amplitudes of the sinusoids
of different frequencies, $\omega$, are changed by a factor $|Y(i\omega)|$. Then
the power spectrum of the ensemble has density $E$ given by the following
(at real frequencies):

$$E = |Y(i\omega)|^2 = Y(p)\, \bar{Y}(p). \tag{15}$$

Since phases are initially entirely random, phases added by the network
do not change the character of the ensemble. Changing the phase
of $Y(i\omega)$ without changing $|Y(i\omega)|$ merely maps individual time
functions into other, equally probable time functions of the same ensemble.

The average square, $\sigma^2$, of the response to the white noise, is simply the
sum of the average squares for the individual frequencies. Thus,

$$\sigma^2 = \int_{-\infty}^{+\infty} |Y(i\omega)|^2\, d\omega = 2 \int_{0}^{\infty} |Y(i\omega)|^2\, d\omega. \tag{16}$$

Here, $\sigma^2$ represents both the "time average" and the "ensemble average",
since the two are identical when they refer to a stationary Gaussian
ensemble. The two ranges of integration are equally permissible, as shown,
because $|Y(i\omega)|^2$ is an even function of $\omega$.*

In the time domain, the white noise may be described as a sequence
of impulses, with infinitesimal spacing along the time scale. The
amplitudes of the impulses are uncorrelated Gaussian random variables. An

* It is assumed here that a spectrum $E$ is so scaled that integrating $E$ from
$\omega = -\infty$ to $+\infty$ gives the variance, $\sigma^2$. "Unit level" white
noise is to be consistent with this assumption and (15). Sometimes the scale of
$E$ is doubled, so that integrating $E$ from $\omega = 0$ to $\infty$ gives $\sigma^2$.

impulse at time $t - \tau$ contributes to the response of a network at time
$t$ in proportion to $K(\tau)$, and $\sigma^2$ may now be expressed as the sum of
the average squares contributed by the individual (uncorrelated) impulses.
When limits are taken properly, the result is:

$$\sigma^2 = \int_{-\infty}^{+\infty} [K(\tau)]^2\, d\tau. \tag{17}$$

Equations (16) and (17) are, of course, consistent with (T-15), in Table
I, which is a special form of Parseval's equation.
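
The agreement between (16) and (17) can be checked numerically on a simple
case. The sketch below assumes the impulse response $K(t) = e^{-at}$ for
$t \geq 0$ and the symmetric $1/\sqrt{2\pi}$ transform normalization suggested
by (T-8), so that $Y(i\omega) = (1/\sqrt{2\pi})/(a + i\omega)$. The value of
`a`, the integration limits, and the helper `trapezoid` are illustrative
choices, not part of the original text.

```python
import math

def trapezoid(func, lo, hi, steps):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

a = 1.0  # decay rate of the illustrative impulse response (an assumption)

def K(t):
    return math.exp(-a * t) if t >= 0.0 else 0.0

def Y_abs_sq(w):
    # |Y(i w)|^2 for Y(i w) = (1 / sqrt(2 pi)) / (a + i w)
    return 1.0 / (2.0 * math.pi * (a * a + w * w))

# Time-domain form (17) and frequency-domain form (16), on truncated ranges.
time_side = trapezoid(lambda t: K(t) ** 2, 0.0, 40.0, 200_000)
freq_side = trapezoid(Y_abs_sq, -2000.0, 2000.0, 400_000)
print(time_side, freq_side, 1.0 / (2.0 * a))
```

Both integrals approach $1/(2a)$; the frequency-domain value falls slightly
short only because the slowly decaying tails beyond the finite limits are
omitted.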

Referring again to Fig. 2, if the output $g(t)$ is interpreted as an
estimate of the true future signal $s(t + \alpha)$, the error $\epsilon(t)$ must be

$$\epsilon(t) = g(t) - s(t + \alpha). \tag{18}$$



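Fig. 6 — A physical model for the calculation of the error $\epsilon(t)$.
[Block diagram: white noise sources No. 1 and No. 2 drive $Y_S(p)$ and
$Y_N(p)$ to give $s(t)$ and $n(t)$; their sum $f(t)$ drives $Y_G$ to give
$g(t)$; $s(t)$ also passes through $e^{\alpha p}$ to give $s(t + \alpha)$; the
difference is the error $\epsilon(t)$.]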



We may think of $\epsilon(t)$ as the output of the (unattainable) circuit shown in
Fig. 6, in which the responses of the different parts are again described in
frequency domain terms.* The average squared error $\sigma^2$ is now the sum
of the contributions from the two uncorrelated white noise sources.
Evaluating these in terms of (16) gives

$$\sigma^2 = \int_{-\infty}^{+\infty} \left( |Y_G|^2 N + |Y_G - e^{i\alpha\omega}|^2 S \right) d\omega. \tag{19}$$

The optimization problem may now be stated as follows: Given the
class of permitted functions $Y_G$ corresponding to all networks of the
permitted sort, find the particular $Y_G$, say $Y_M$, such that $\sigma^2$ is a
minimum. Assume tentatively that $Y_M$ exists and let $\Delta_Y(p)$ be defined by

$$Y_G(p) = Y_M(p) + \Delta_Y(p). \tag{20}$$



* When $\alpha > 0$, the box marked $e^{\alpha p}$ is nonphysical, since
$e^{\alpha p}$ corresponds to a negative delay, or ideal prediction. So long as
the noise is present, the box is connected to an unavailable signal source,
$s(t)$ without $n(t)$, whatever the value of $\alpha$. These circumstances are
what makes $\epsilon(t)$ the error, instead of an observable correction which
could be used to determine $s(t + \alpha)$ exactly.

Under assumption vi, if $Y_G(p)$ in (20) is a permitted $Y_G(p)$, then

$Y_M(p) + k\Delta_Y(p)$

is a permitted $Y_G(p)$, where $k$ is any positive or negative real constant.
Substituting $Y_M + k\Delta_Y$ for $Y_G$ in (19) gives:



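$$\begin{aligned}
\sigma^2 ={}& \int_{-\infty}^{+\infty} \left( |Y_M|^2 N + |Y_M - e^{i\alpha\omega}|^2 S \right) d\omega \\
&+ k^2 \int_{-\infty}^{+\infty} (N + S)\, |\Delta_Y|^2\, d\omega \\
&+ k \int_{-\infty}^{+\infty} [Y_M (N + S) - e^{i\alpha\omega} S]\, \bar{\Delta}_Y\, d\omega \\
&+ k \int_{-\infty}^{+\infty} [\bar{Y}_M (N + S) - e^{-i\alpha\omega} S]\, \Delta_Y\, d\omega.
\end{aligned} \tag{21}$$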

In this expression the last two integrals are equal, for the following
reasons: First, each of the two integrals is real, for the imaginary part of the
integrand is always an odd function of $\omega$. Second, the two integrals can
at most be conjugates of each other, since their integrands are conjugates
at real $\omega$. Replacing the two integrals by twice the first leaves

$$\begin{aligned}
\sigma^2 ={}& \int_{-\infty}^{+\infty} \left( |Y_M|^2 N + |Y_M - e^{i\alpha\omega}|^2 S \right) d\omega \\
&+ k^2 \int_{-\infty}^{+\infty} (N + S)\, |\Delta_Y|^2\, d\omega \\
&+ 2k \int_{-\infty}^{+\infty} [Y_M (N + S) - e^{i\alpha\omega} S]\, \bar{\Delta}_Y\, d\omega.
\end{aligned} \tag{22}$$



When $k = 0$, $Y_M + k\Delta_Y = Y_M$, and $\sigma^2$ must be a minimum. This will
be true only if the coefficient of $k$ in (22) is zero; hence the third integral
must be zero. Furthermore, for $Y_M$ to be a true optimum, the integral
must be zero whatever permitted $Y_G$ is used in (20) in determining $\Delta_Y$.
If the variable of integration, $\omega$, is replaced by $p = i\omega$ (for
convenience in what follows), this requires:

For every permitted $\Delta_Y$:

$$\int_{-i\infty}^{+i\infty} [Y_M (N + S) - e^{\alpha p} S]\, \bar{\Delta}_Y\, dp = 0. \tag{23}$$



Our principal concern, in Sections III and IV, will be the solution of
(23) for $Y_M$, for different classes of permitted functions $Y_G$. Generally,
the class of permitted functions will exclude the obvious solution which
makes the bracket, [ ], in (23) identically zero. Then the integrand will
not be identically zero, but will have to be such that the integral will be
zero for every $\Delta_Y$ which may be derived by using a permitted $Y_G$ in
(20).

When (23) is true, (22) may be written as follows, for every permitted
$Y_G$:

$$\begin{aligned}
\sigma^2 &= \sigma_M^2 + k^2 \int_{-\infty}^{+\infty} (N + S)\, |\Delta_Y|^2\, d\omega, \\
\sigma_M^2 &= \int_{-\infty}^{+\infty} \left( |Y_M|^2 N + |Y_M - e^{i\alpha\omega}|^2 S \right) d\omega.
\end{aligned} \tag{24}$$

Here $\sigma_M^2$ is the $\sigma^2$ achieved with $Y_G = Y_M$, and the second term in
$\sigma^2$ is clearly nonnegative.

Equation (24) implies the following situation: Under assumptions v
and vi, a true minimum variance $\sigma_M^2$ exists (at least as a limit of a
sequence of $\sigma^2$'s corresponding to a sequence of permitted functions $Y_G$).
Any solution of (23) for $Y_M$ within the class of permitted functions $Y_G$
must yield the true $\sigma_M^2$. Since $N + S$ is nonnegative in (23), no other
permitted $Y_G$ can yield a smaller $\sigma^2$. When $N + S$ is also nonzero at real
frequencies, no other $Y_G$ can yield as small a $\sigma^2$, and the solution for
$Y_M$ is at once unique.

When $N + S$ is zero at one or more real frequencies, the situation
regarding uniqueness is not so clear, for any $\Delta_Y$ which is nonzero at those
frequencies but zero at all other real frequencies will lead to a new $Y_G$,
yielding the same $\sigma^2 = \sigma_M^2$. Under assumption iv, $N + S$ can be zero
only at discrete frequencies. As a result, if there are two solutions for $Y_M$
in (23), at least one must include transfer functions of filters with
infinitesimal bandwidths. It is questionable whether zero-bandwidth filters
may be called "physical"; in any event, they cannot be built and
alternative solutions do not reduce $\sigma_M^2$. Accordingly, any solution of (23)
for $Y_M$ which does not include transfer functions of zero-bandwidth filters
will be called unique, whether or not $N + S$ has zeros at real frequencies.

Certain statements about convergence will be useful later on. Under
assumption v, $Y_M$ will make $\sigma^2$ finite. In seeking $Y_M$, then, we may
start by excluding all $Y_G$ for which $\sigma^2 = \infty$. Since the two integrals
in (24) are nonnegative, $\sigma^2$ will be bounded only if both integrals
converge. Each of the two integrands is a sum of two nonnegative terms. It
follows that each term must meet convergence conditions. Combining these
yields an additional useful condition, which will be satisfied by the
integrand in (23). The five conditions may be written collectively as follows:

When $\omega$ is real and $\to \infty$ the following five functions
$= O\omega^{-2}$:†

$$|Y_M|^2 N, \quad |Y_M - e^{i\alpha\omega}|^2 S, \quad |\Delta_Y|^2 N, \quad |\Delta_Y|^2 S, \quad [Y_M (N + S) - e^{\alpha p} S]\, \bar{\Delta}_Y. \tag{25}$$

2.6.1 An Equivalent Formulation in the Time Domain 

Time-domain equivalents of (23) and (24) can easily be derived. The
above analysis can be paralleled in time-domain terms or, alternatively,
(T-14) and (T-15) of Table I can be applied directly to (23) and (24).
Let $K_G(t)$, $K_M(t)$, $\Delta_K(t)$ be the inverse transforms of $Y_G(p)$,
$Y_M(p)$, $\Delta_Y(p)$. Then (23) becomes:

For every permitted $\Delta_K$,

$$\int_{-\infty}^{+\infty} [K_M(\tau) * \Phi_f(\tau) - \Phi_s(\tau + \alpha)]\, \Delta_K(\tau)\, d\tau = 0. \tag{26}$$

Equation (24) can be transformed into various time-domain equivalents,
of which the following is perhaps the most interesting:

$$\begin{aligned}
\sigma^2 &= \sigma_M^2 + \int_{-\infty}^{+\infty} [K_F * \Delta_K]^2\, d\tau, \\
\sigma_M^2 &= \int_{-\infty}^{+\infty} \left\{ [K_M * K_N]^2 + [(K_M - \delta(\tau + \alpha)) * K_S]^2 \right\} d\tau.
\end{aligned} \tag{27}$$

2.7 Breadth of the Optimum 

The $Y_M$ determined by (23) may or may not be realizable with a finite
network of lumped elements, even though assumption iv insures that
$Y_N$ and $Y_S$ of Fig. 2 could be so realized. When $Y_M$ cannot be realized
with a finite network of lumped elements, it may be necessary to replace
$Y_M$ by a reasonable approximation to it, which can be so realized (and
similarly for equivalent nonelectrical devices). A reasonable approximation
will be one which can be realized in a reasonable way, and which
makes $\sigma^2$ only a little greater than the minimum, $\sigma_M^2$.

† Recall the meaning of $O$ noted in Section 2.4.1.

The approximation of general "physical" network functions with
"finite network approximations" is a familiar problem in general network
theory, which need not be developed here. In this connection, however,
it is important to note the situation described briefly below.

In engineering problems, the minimum exhibited by $\sigma^2$ as a function
of $Y_G$ is usually quite broad. In other words, $Y_G$ may be made quite a
little different from $Y_M$ without increasing $\sigma^2$ very much. This has not
been proved but is simply a matter of experience in problems of the
engineering sort (and, in fact, it cannot even be stated exactly without
assigning a more quantitative meaning to the expression "broad
minimum").

A broad minimum does not mean that all small departures from $Y_M$
have small effects on $\sigma^2$. For example, a change from order $c/p^n$ at
$p = \infty$, to order of $c/p^{n-1}$ may yield only small departures from
$Y_M(p)$ at all real frequencies, but it is likely to change
$\sigma^2 \approx \sigma_M^2$ into $\sigma^2 = \infty$. Generally, reasonable percentage
changes, relative to the magnitude of $Y_M$ at corresponding frequencies, may
be tolerated. The percentage changes may be frequency-dependent, and may be
real or complex. The specific sensitivities to changes, however, will depend
on the specific values of $N$ and $S$.

The effect of specific departures from $Y_M$, in specific problems, may
be calculated by means of (24) or (27).

III. GENERAL TECHNIQUES, IN TERMS OF SPECIFIC PROBLEMS 

The remainder of the paper describes the calculation of $Y_M$ from (23),
and modifications thereof. The specific $Y_M$ determined by (23) depends
on the class of permitted functions $Y_G$, within which $Y_M$ is to be the
optimum choice. A number of different classes are of interest, on both
theoretical and practical grounds. Furthermore, the appropriate techniques
for calculating $Y_M$ vary with the permitted class, in ways which
are likely to be nontrivial.

In this section we consider some fairly general classes of permitted
functions, which will illustrate general techniques. In Section IV we shall
examine variations and special cases, chosen primarily for their engineering
interest.

In describing the properties of the different classes, it will be convenient
to use the following general notation:

a. $C_Y$ = the class of permitted functions $Y_G$, within which $Y_M$
is to be the optimum choice;

b. $C_\Delta$ = the class of functions $Y_{G1} - Y_{G2}$, where $Y_{G1}$ and
$Y_{G2}$ are any two $Y_G$ in $C_Y$.                                    (28)

Fig. 7 — The optimum nonphysical impulse response when $\beta$ is large.
[Sketch of the nonphysical $K_M$, the inverse transform of
$e^{-\beta p} S/(N + S)$, divided into the physically realizable part of
$K_M$ and the nonphysical part of $K_M$.]

The class $C_\Delta$ is completely determined by the class $C_Y$. In terms of
$C_\Delta$, assumption vi of Section 2.2 may be written as follows:

If $\Delta_Y$ is in $C_\Delta$, $k\Delta_Y$ is in $C_\Delta$.               (29)



3.1 Optimum Nonphysical Network 

The integral in (23) will surely vanish if the integrand is zero
identically for every permitted $\Delta_Y$. The integrand will be zero if

$$Y_M = e^{\alpha p}\, \frac{S}{N + S}. \tag{30}$$

This $Y_M$, however, is generally nonphysical. First, it is not generally
regular in the finite part of rhp. Second, when $\alpha > 0$ it behaves
improperly at $p \to \infty$.

When $\alpha$ is negative, $Y_M$ behaves properly at $p \to \infty$, but there are
still singularities in the rhp [except when $S/(N + S)$ is a constant]. Let
$\alpha$ equal $-\beta$. Then $g(t)$ is an estimate of $s(t)$ for a time $\beta$
seconds in the past. If $\beta$ is sufficiently large, $Y_M$ can be approximated
closely with a physical $Y$.* This is illustrated in Fig. 7, in time-domain
terms. Note that the inverse transform of $Y_M$ is symmetrical about the time
$\tau = \beta$, which represents the length of the time interval from the time
for which the signal $s(t - \beta)$ is estimated up to present time $t$.

A large $\beta$ is appropriate for reducing data long after they are collected,
such as reconstructing the flight of an experimental missile from recorded
positions or velocities. Then $\beta$ is the length of time by which the interval
of observation extends beyond the time for which $s$ is to be estimated.

* See Ref. 9, Section V.

When the $Y_M$ of (30) is used in (24), the minimum average square error
is (after some simple manipulation)

$$\sigma_M^2 = \int_{-\infty}^{+\infty} \frac{NS}{N + S}\, d\omega. \tag{31}$$

Then assumption v of Section 2.2 requires

$$\frac{NS}{N + S} = O\omega^{-2} \text{ when } \omega \to \infty. \tag{32}$$

The same restriction on $N$ and $S$ will surely still be necessary (although
perhaps not sufficient) when the choice of $Y_M$ is restricted to any subset
of the present $C_Y$.
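
The relations (30), (31), and (24) can be checked numerically on a simple
case, which also gives a feel for how broad the minimum of Section 2.7 can be.
The sketch below assumes $\alpha = 0$, a signal spectrum $S = 1/(1 + \omega^2)$,
a flat noise spectrum $N = 0.1$, and a finite band as a stand-in for the
infinite range of integration. These spectra, the 10 per cent gain error
`eps`, and all function names are illustrative assumptions, not values from
the text.

```python
def S(w):
    return 1.0 / (1.0 + w * w)      # illustrative signal spectrum (assumption)

def N(w):
    return 0.1                      # illustrative flat noise spectrum (assumption)

def Y_M(w):
    return S(w) / (N(w) + S(w))     # eq. (30) with alpha = 0

def integrate(func, lo=-200.0, hi=200.0, steps=80_000):
    """Midpoint rule over a finite band standing in for (-inf, +inf)."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

def sigma_sq(Y):
    # eq. (19) with alpha = 0 and a real-valued Y
    return integrate(lambda w: Y(w) ** 2 * N(w) + (Y(w) - 1.0) ** 2 * S(w))

eps = 0.1                           # fractional gain error (assumption)

def Y_G(w):
    return (1.0 + eps) * Y_M(w)     # a permitted function near the optimum

sigma_M_sq = sigma_sq(Y_M)
sigma_31 = integrate(lambda w: N(w) * S(w) / (N(w) + S(w)))       # eq. (31)
sigma_G_sq = sigma_sq(Y_G)
excess_24 = integrate(lambda w: (N(w) + S(w)) * (eps * Y_M(w)) ** 2)  # eq. (24)

print(sigma_M_sq, sigma_31)
print(sigma_G_sq - sigma_M_sq, excess_24)
```

The first pair of numbers agrees because (19) reduces to (31) when (30) is
substituted. The second pair agrees because the cross term in (22) vanishes at
the optimum, so the penalty grows only as the square of the departure from
$Y_M$.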

3.2 Optimum Physical Network 

In this section, we restrict the function class $C_Y$, from which $Y_M$
is to be chosen, to the "physical" subset of the class permitted in the
previous section.

$C_Y$ = the class of all physical frequency functions. (33)

By "physical", we again mean rrhp as in (8) and (9). Since the difference
of two rrhp functions is also rrhp, (33) implies

$C_\Delta$ = the class of all physical frequency functions. (34)

The integrand in (23) can no longer be identically zero for all the $\Delta_Y$
permitted by $C_\Delta$, but the restrictions on $\Delta_Y$ are such that they may
be taken advantage of, in one way or other. Note that the $C_\Delta$ of (34)
obeys (29), and hence also our assumption vi.

The optimum $Y_M$ may be determined in the following way: First, (33)
is applied to (23), to obtain tentative conditions on $Y_M$ which are
consistent with (33) and in which $\Delta_Y$ does not appear. As derived, these
are sufficient conditions. Any corresponding $Y_M$, if one exists, must be the
correct $Y_M$, but it is not at once apparent that one does exist. The
necessity of the conditions is then established by demonstrating that a
corresponding $Y_M$ does, in fact, exist. (Recall that the correct $Y_M$ is
unique.) This is a procedure which is useful in many variations of the
smoothing and prediction problem.

The integral in (23) will be zero if its integrand behaves like $H(p)$ of
(12). By (25), the integrand behaves properly when $\omega$ is real and
$\to \infty$. By (33), the factor $\bar{\Delta}_Y$ in the integrand in (23) is
rlhp. Therefore, if the remaining factor in the integrand in (23) is made rlhp
both (12a) and (12b) will be true, and (23) will be satisfied. Assembling this
condition with (25) and (33) gives the following set of sufficient conditions
on $Y_M$:

a. $Y_M$ is rrhp,

b. $Y_M (N + S) - e^{\alpha p} S$ is rlhp,                              (35)

c. When $\omega$ is real and $\to \infty$,
   $|Y_M|^2 N$ and $|Y_M - e^{i\alpha\omega}|^2 S = O\omega^{-2}$.

It remains to be shown that a $Y_M$ meeting (35) does, in fact, exist. The
demonstration is somewhat different for positive and negative values of
$\alpha$.


3.2.1 Prediction for a Present or Future Time ($\alpha \geq 0$)

When $\alpha$ is positive in (35b) the behavior of $e^{\alpha p} S$ as
$p \to \infty$ is consistent with rlhp. Then $Y_M(p)$ may reasonably be rational
(under our assumption of rational $N$ and $S$). When $Y_M$ must be physical and
$\alpha$ is nonzero and positive, $|Y_M - e^{i\alpha\omega}|$ cannot $\to 0$, when
$\omega$ is real and $\to \infty$. Then assumption v, as reflected in condition
(25), requires $S = O\omega^{-2}$ as $\omega \to \infty$. (The limiting case of
$\alpha = 0$ will be examined later.) When $S = O\omega^{-2}$ as
$\omega \to \infty$, conditions (35) can easily be translated into the
following set, which refers to poles and zeros in the finite part of the $p$
plane:

a. The poles of $Y_M$ are the lhp zeros of $N + S$,

b. At the lhp poles of $N$ and $S$, $Y_M = e^{\alpha p}\, S/(N + S)$,    (36)

c. The degree of the numerator of $Y_M$ is a minimum, within conditions
   a and b.

These conditions determine a $Y_M$ uniquely. The $Y_M$ so determined will
also satisfy (35), provided the degree determined by (36c) is such that
(35c) is satisfied. It is shown below that (36c) is, in fact, just consistent
with (35c).

Recall that $N$ and $S$ are even functions, with half of their zeros and
poles in each half plane. Also, $Y_M \bar{Y}_M$, which is $|Y_M|^2$ at real
$\omega$, is an even function, in which exactly half of the zeros and poles are
zeros and poles of $Y_M$. Then, under (36a), the number of (finite) poles of
$Y_M \bar{Y}_M$ is exactly the number of (finite) zeros of $(N + S)$. On the
other hand, the number of zeros of $Y_M$ is one less than the number of lhp
poles of $N + S$, for the scale of $Y_M$ can be adjusted, as well as the zeros,
in meeting (36b). Then the number of zeros of $Y_M \bar{Y}_M$ is exactly two
less than the number of poles of $N + S$. As a result,

$$|Y_M|^2 (N + S) \to c\, \omega^{-2} \text{ as } \omega \to \infty. \tag{37}$$

Finally, since $N$ and $S$ are nonnegative at real $\omega$ (and hence cannot
cancel each other at $\omega \to \infty$), both

$|Y_M|^2 N$ and $|Y_M|^2 S = O\omega^{-2}$ as $\omega \to \infty$.

Since we already know that $S = O\omega^{-2}$, this is sufficient to establish
(35).

Conditions (36) need further interpretation for special cases. When
zeros or poles of $N + S$ occur on the axis of real frequencies, they occur
in identical pairs.* One of each pair is to be interpreted as in the lhp.
Certain zeros of $N + S$ may coincide with poles common to $N$ and $S$,
when they are computed from the numerators and denominators of the
(rational) $N$ and $S$. Lhp zeros of this sort are to be retained as poles
of $Y_M$.†

When $\alpha = 0$, the second condition in (35c) becomes
$|Y_M - 1|^2 S = O\omega^{-2}$ as $\omega \to \infty$. So long as
$S = O\omega^{-2}$ as $\omega \to \infty$, conditions (36) are still appropriate.
Now, however, $S$ may be nonzero, or even unbounded at $\omega \to \infty$. When
$S$ is nonzero but bounded, $Y_M \to 1$ as $\omega \to \infty$. When $S$ has poles
at $\infty$, $(Y_M - 1) \to c/p^n$. When (36) is properly modified to include
the new conditions, a unique $Y_M$ is again determined, which again satisfies
(35), provided $N$ is such that (32) is still satisfied. When $S \neq 0$ at
$\infty$, however, (32) will not be satisfied unless $N = O\omega^{-2}$ as
$\omega \to \infty$.
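
Conditions (36) can be followed through on a small worked case with
$\alpha > 0$. The choice $S = 1/(1 - p^2)$ (that is, $1/(1 + \omega^2)$ at real
frequencies) with $N = 0$ is an assumption made here purely for illustration
and does not come from the text.

```latex
% Illustrative assumption: S = 1/(1 - p^2), N = 0, \alpha > 0.
\begin{align*}
N + S &= \frac{1}{(1 + p)(1 - p)}
  \quad \text{(no finite zeros; one lhp pole, at } p = -1 \text{)}, \\
\text{(36a):}\quad & Y_M \text{ has no finite poles}, \\
\text{(36b):}\quad & Y_M(-1) = \left. \frac{e^{\alpha p} S}{N + S} \right|_{p = -1}
  = e^{-\alpha}, \\
\text{(36c):}\quad & Y_M(p) = e^{-\alpha}
  \quad \text{(a constant is the lowest-degree numerator)}, \\
\sigma_M^2 &= \int_{-\infty}^{+\infty}
  \frac{|e^{-\alpha} - e^{i\alpha\omega}|^2}{1 + \omega^2}\, d\omega
  = \pi \left( 1 - e^{-2\alpha} \right).
\end{align*}
```

Under these assumptions the optimum predictor is a simple attenuator, and the
error variance rises from zero at $\alpha = 0$ toward $\pi$, the variance of
$s$ itself under the scaling of (16), as the prediction time grows.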



* A necessary consequence of the positiveness of spectra $N$ and $S$.
† This is confirmed by (36b), which makes $Y_M$ infinite at a lhp pole of $S$
which is cancelled by a like pole of $N$ in $N + S$.

3.2.2 Estimation for a Past Time ($\alpha < 0$)

If $\alpha = -\beta$, (35) becomes:

a. $Y_M$ is rrhp,

b. $Y_M (N + S) - e^{-\beta p} S$ is rlhp,                              (39)

c. When $\omega$ is real and $\to \infty$,
   $|Y_M|^2 N$ and $|Y_M - e^{-i\beta\omega}|^2 S = O\omega^{-2}$.

When $\beta$ is positive, $e^{-\beta p} S$ does not behave at infinity in an
rlhp manner, as defined by (10). Hence $Y_M(p)$ can no longer be rational, but
must contain a term which annuls the forbidden behavior at infinity.
Let $Y_M$ be represented as

$$Y_M = A + \frac{S}{N + S}\, e^{-\beta p}. \tag{40}$$

Then, if (39c) is rearranged with due regard for (32) and the positiveness
of $N$ and $S$, (39) now becomes

a. $A + \dfrac{S}{N + S}\, e^{-\beta p}$ is rrhp,

b. $A (N + S)$ is rlhp,                                                 (41)

c.* When $\omega$ is real and $\to \infty$, $|A|^2 (N + S) = O\omega^{-2}$.

The function $A$ can now be factored, recalling $Y_F$ of Section 2.3 and
Fig. 3:

$Y_F \bar{Y}_F = N + S$,
                                                                        (42)
$Y_F$, $1/Y_F$ are rrhp.

We can multiply the function in (41a) by $Y_F$, without changing its
rrhp character. (If rrhp, it will remain rrhp; if not rrhp, it will remain
not rrhp.) Similarly, we can divide the function in (41b) (arbitrarily)
by $\bar{Y}_F$. Then (40) and (41) may be written as follows [if (42) is again
used]:

$$Y_M = \frac{1}{Y_F} \left( B + \frac{S}{\bar{Y}_F}\, e^{-\beta p} \right),$$

a. $B + \dfrac{S}{\bar{Y}_F}\, e^{-\beta p}$ is rrhp,                    (43)

b. $B$ is rlhp,

c. When $\omega$ is real and $\to \infty$, $|B|^2 = O\omega^{-2}$.

The conditions (43) can only be realized with a unique rational $B$.
The poles of $B$ are in the rhp but they are cancelled in $Y_M$ by poles of
$S/\bar{Y}_F$. Because the poles of $B$ are in the rhp, $Y_M$ cannot be realized
exactly by a finite network of lumped elements and delay lines. It can
be approximated arbitrarily closely, however, by transversal filter
techniques.

* This may be derived in the following way: Use the $Y_M$ of (40) in the two
functions in (39c). Add the two functions, rearrange, and separate out the
function in (41c), making use of the following theorem: Let $U_1$ and $U_2$ be
two functions of $\omega$; if $U_1 = O\omega^{-2}$ and $U_2 = O\omega^{-2}$, then
$U_1 + U_2 = O\omega^{-2}$; conversely, if $U_1 + U_2 = O\omega^{-2}$, and both
$U_1$ and $U_2$ are nonnegative, then $U_1 = O\omega^{-2}$ and
$U_2 = O\omega^{-2}$.

3.2.3 A Time-Domain Interpretation 

A time-domain interpretation of $Y_M$ may be derived from (35) which
coincides with, for example, Bode and Shannon's interpretation.* This
may be accomplished by using the functions $Y_F$ and $\bar{Y}_F$ much as in the
previous section.

Since $\bar{Y}_F$ and $1/\bar{Y}_F$ are both rlhp, the function in (35b) may be
multiplied by $1/\bar{Y}_F$ without altering its rlhp character. (If rlhp, it
will remain rlhp; if not rlhp, it will remain not rlhp.) Then (35b) becomes

$$Y_M Y_F - \frac{S}{\bar{Y}_F}\, e^{\alpha p} \text{ is rlhp}. \tag{44}$$

By (T-13), the inverse transform must be 0 when $t > 0$. Hence, the two
terms in the difference must be equal when $t > 0$. Since $Y_M$ and $Y_F$ are
both physical, $Y_M Y_F$ is physical and its inverse transform must be zero
when $t < 0$. Then (44) becomes, by (T-10),

$$K_M * K_F = \begin{cases}
\text{inverse transform of } \dfrac{S}{\bar{Y}_F}\, e^{\alpha p} & \text{when } t > 0, \\
0 & \text{when } t < 0.
\end{cases} \tag{45}$$

Equation (45) determines $K_M * K_F$ uniquely. The transform is $Y_M Y_F$,
and $Y_M$ can be found by dividing out the (minimum phase, physical)
factor $Y_F$. Note that (45) can be solved for $K_M$ even though the spectra
are nonrational: $F = N + S$ must be such that $Y_F$ can be found from
$F = Y_F \bar{Y}_F$ by means of the loss-phase integral; and various transforms
and convolutions must be evaluated.
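
The truncation step in (45) can be sketched for one simple case. The sketch
assumes $S = 1/(1 - p^2)$ and $N = 0$, so that $Y_F = 1/(1 + p)$ and
$S/\bar{Y}_F = 1/(1 + p)$, whose inverse transform is proportional to
$e^{-t}$ for $t \geq 0$. The spectra, the lead time `alpha`, the omitted scale
constant, and the function names are all illustrative assumptions.

```python
import math

alpha = 0.7  # prediction lead time (illustrative assumption)

def k_unshifted(t):
    # Inverse transform of S / conj(Y_F), up to a constant scale factor.
    return math.exp(-t) if t >= 0.0 else 0.0

def k_advanced(t):
    # Multiplying a transform by exp(alpha p) advances the time function by alpha.
    return k_unshifted(t + alpha)

def k_m_conv_k_f(t):
    # Eq. (45): keep the advanced function for t > 0, set it to zero for t < 0.
    return k_advanced(t) if t > 0.0 else 0.0

times = [0.05 * n for n in range(1, 200)]
ratios = [k_m_conv_k_f(t) / k_unshifted(t) for t in times]
print(min(ratios), max(ratios), math.exp(-alpha))
```

In this case the retained part is a constant multiple, $e^{-\alpha}$, of the
impulse response belonging to $Y_F$, so dividing out $Y_F$ leaves a constant
$Y_M = e^{-\alpha}$.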

3.2.4 A Further Interpretation

A further interpretation of (45) is useful when one seeks certain
generalizations of the present problem. The time series which are fundamental
to the problem are the true signal $s(t)$ and the observed signal
$f(t)$. When $s(t)$ is to be predicted for a specific time, $t + \alpha$, from
values of $f$ observed over a specific interval, $(-\infty < \tau < t)$, only
certain of the linear covariances, describing statistical characteristics of
$s$ and $f$, are actually pertinent to the problem. The pertinent linear
covariances are:

1. the auto-covariance of the observed signal $f$ at any two times, $t_1$,
$t_2$, in the observation interval;

2. the cross-covariance between $f$ at any time, $t_1$, in the observation
interval and $s$ at the special time $t + \alpha$;

3. the expected value of $s^2$, which is the zero lag covariance of $s$, at
the special time $t + \alpha$.

The auto-covariance function $\Phi_s$, or the corresponding spectrum $S$, is
pertinent to our present problem only as it affects the covariances of $f$
through the relation $f(t) = s(t) + n(t)$.

It follows from the above that the original Gaussian model may be
replaced by any other Gaussian model which retains the same pertinent
linear covariances, without affecting either $Y_M$ or $\sigma_M^2$. For example,
the model represented by Fig. 6 may be replaced by that shown in Fig. 8.
The network with frequency function $Y_F$ generates an $f(t)$ with the
correct auto-covariance (in accordance with Section 2.3 and Fig. 3). The
network with frequency function $Y_{FS}$, in the presence of the other
network, supplies the correct cross-covariance between $f(t_1)$ and
$s(t + \alpha)$. (The cross-covariance arises from the sharing of the same white
noise, by $s$ and $f$.) Note that $Y_{FS}$ in Fig. 8 is exactly the function
$(S/\bar{Y}_F)e^{\alpha p}$ in (44). The single (time-invariant) Gaussian random
variable $v_s$ contributes only to $s$, and gives the correct variance of $s$
when added to the contribution from the white noise. Note that the variance of
$v_s$ is exactly the $\sigma_M^2$ of Section 3.1, corresponding to the optimum
nonphysical $Y_M$.



Fig. 8 — An alternate physical model, which retains the pertinent covariances.
[Block diagram: a single white noise source drives $Y_F$ to give $f(t)$ and
drives $Y_{FS}$, whose output is added to the time-invariant noise $v_s$ to
give $s(t + \alpha)$; $f(t)$ drives $Y_G$ to give $g(t)$, and the difference
is the error $\epsilon(t)$. Here $Y_F \bar{Y}_F = S + N$ and
$Y_{FS} = (S/\bar{Y}_F)e^{\alpha p}$.]

Because $Y_F$ has been chosen in such a way that $1/Y_F$ is physical, the
white noise of Fig. 8 can be recovered. If $Y_M Y_F = Y_{FS}$, the estimate
$g(t)$, of $s(t + \alpha)$, is exactly the contribution of the white noise to
$s(t + \alpha)$, and the error in the estimate is exactly $v_s$. If $Y_M$ must be
physical, $Y_M Y_F$ can yield only the contributions to $s(t + \alpha)$ from
present and past values of the white noise. This is roughly the description of
the solution used by Bode and Shannon (in Ref. 9, Section VII).

3.3 Optimum Network with a Finite Memory

In Section 3.2 we assumed that all past values of $f(t)$ (the signal plus
noise) were available, back to $t = -\infty$. Furthermore, the $Y_M(p)$ of (35)
does generally lead to the utilization of all past values. In practical
applications, the corresponding $K_M(\tau)$ is usually very small when the
age of data $\tau$ is sufficiently large, say $\tau > \tau_m$; and data older
than $\tau_m$ may reasonably be neglected. Frequently, however, $f(t)$ is
available only for a smaller interval, say $0 < \tau < T$; and then the
procedure must be revised. This is the central problem considered by Zadeh and
Ragazzini. It may be referred to as the "finite memory" problem, as opposed to
the "infinite memory" considered in Section 3.2.

We can restrict ourselves to values of $f(t)$ no older than $T$ by making
our function class $C_Y$ correspond to physical networks with memories
which extend only $T$ seconds into the past. With fixed networks of this
sort, $T$ is constant, and the used data begins at a variable time $t - T$.
When $f(t)$ is available from a fixed starting time, $t_0$, to present time $t$,
$T$ is a function of $t$. Then a variable network must be designed with a
response which equals or approximates the response of a different fixed
network at each different $t$.

Our smoothing and prediction device will remember only $T$ seconds
into the past if the impulse response $K_G$ is restricted by

$$K_G(t) = 0, \text{ except when } 0 < t < T. \tag{46}$$

An equivalent frequency domain restriction may be derived as follows:
Referring to Fig. 9, if $K_G(t) = 0$ except when $0 < t < T$, it follows that
$K_G(-t) = 0$ except when $-T < t < 0$. But then $K_G[-(t - T)] = 0$
except when $0 < t < T$, which meets the conditions for physicalness.
If $Y_G$ is the transform of $K_G$, then $\bar{Y}_G e^{-Tp}$ is the transform of
$K_G[-(t - T)]$, by (T-5) and (T-9). Thus (46) corresponds to

$C_Y$ is the class of functions $Y_G(p)$ such that:

a. $Y_G$ is rrhp,
                                                                        (47)
b. $\bar{Y}_G e^{-Tp}$ is rrhp.

Since differences of $K_G$'s of the form (46) also obey (47), it follows that

$C_\Delta$ is the class of functions $\Delta_Y$ such that:

a. $\Delta_Y$ is rrhp,
                                                                        (48)
b. $\bar{\Delta}_Y e^{-Tp}$ is rrhp.

Note that (48) is consistent with (29) and hence also with assumption 
vi of Section 2.2. 

The following is an important property of functions in the class (47):
Replacing $Y_G$ by $\bar{Y}_G$ maps singularities from either half plane into the
other. Then (47) excludes singularities from all finite parts of the $p$
plane. Let rfpp mean "regular in the finite part of the plane". Then,
(47) implies that

$Y_G$ is rfpp. (49)
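
Property (49) can be seen on the simplest finite-memory response. The sketch
below assumes $K_G(t) = 1$ for $0 < t < T$ and the symmetric $1/\sqrt{2\pi}$
transform normalization, for which the transform is
$(1 - e^{-pT})/(p\sqrt{2\pi})$ with a removable singularity at $p = 0$. The
rectangular response, the value of `T`, the sample points, and the function
names are illustrative assumptions.

```python
import cmath
import math

T = 2.0  # memory length (illustrative assumption)
ROOT_2PI = math.sqrt(2.0 * math.pi)

def transform_numeric(p, steps=20_000):
    """Midpoint-rule value of (1/sqrt(2 pi)) * integral_0^T exp(-p t) dt."""
    h = T / steps
    total = 0j
    for n in range(steps):
        total += cmath.exp(-p * (n + 0.5) * h)
    return total * h / ROOT_2PI

def transform_closed(p):
    """Closed form, with the removable singularity at p = 0 filled in."""
    if p == 0:
        return complex(T / ROOT_2PI)
    return (1.0 - cmath.exp(-p * T)) / (p * ROOT_2PI)

# Sample points in the right half plane, the left half plane, and on the axis.
points = [complex(1.5, 2.0), complex(-1.5, 2.0), complex(0.0, 3.0),
          0j, complex(-3.0, -1.0)]
errors = [abs(transform_numeric(p) - transform_closed(p)) for p in points]
print(max(errors))
```

The integral over a finite interval is finite at every finite $p$, in either
half plane, which is the content of (49). For this symmetric rectangle,
$\bar{Y}_G e^{-Tp}$ happens to equal $Y_G$ itself, so both conditions of (47)
reduce to the same statement.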

There will be no $Y_M$ of the class (47) which satisfies (35). Hence, a new
relation must be derived from (23). To take advantage of (47), $Y_M$ may
be expressed in the following arbitrary way:

$$Y_M = A + B e^{-Tp}. \tag{50}$$

If this expression is used in (23), the integral in (23) may be divided into
two parts, as follows:

$$\int_{-i\infty}^{+i\infty} [A (N + S) - e^{\alpha p} S]\, \bar{\Delta}_Y\, dp
+ \int_{-i\infty}^{+i\infty} [B (N + S)]\, e^{-Tp}\, \bar{\Delta}_Y\, dp = 0. \tag{51}$$

Fig. 9 — Impulse responses of finite duration. [Panels: (a) $K_G(t)$ =
inverse transform of $Y(p)$, $= 0$ when $t < 0$; (b) $K_G(-t)$ = inverse
transform of $\bar{Y}(p)$, $= 0$ when $t > 0$; (c) $K_G[-(t - T)]$ = inverse
transform of $\bar{Y}(p)e^{-Tp}$, $= 0$ when $t < 0$.]

The two integrals in (51) will be zero, individually, if the two integrands
behave like $H(p)$ of (12). This will be true under conditions noted below.
In the first integrand, $\bar{\Delta}_Y$ is rlhp. In the second,
$\bar{\Delta}_Y e^{-Tp}$ is rrhp by (48). For (12a) to be true, the other factor
in each integrand must be regular in the same half plane. The convergence
condition (12b) must also be applied, and it can be applied if (25) is
properly taken into account. When (47) also is transcribed the following
conditions may be assembled as sufficient conditions on $A$ and $B$:

a. $A + B e^{-Tp}$ is rrhp,

b. $\bar{B} + \bar{A} e^{-Tp}$ is rrhp,

c. $A (N + S) - e^{\alpha p} S$ is rlhp,

d. $B (N + S)$ is rrhp,                                                 (52)

e. When $\omega$ is real and $\to \infty$, the following functions
   $= O\omega^{-2}$:

   $|A|^2 N$, $|A - e^{i\alpha\omega}|^2 S$,
   $|B|^2 N$, $|B|^2 S$.

Functions $A$ and $B$ which satisfy (52) do, in fact, exist; hence the
conditions are necessary as well as sufficient. As in Section 3.2, the
solution takes different forms for positive and negative values of $\alpha$.

3.3.1 Prediction for a Present or Future Time ($\alpha \geq 0$).

When $\alpha$ is positive in (52c) $A$ and $B$ are rational. By the definition
of rrhp in (8), conditions a and b will be satisfied if the rational $A$ and
$B$ satisfy (49). When $S = O\omega^{-2}$ as $\omega \to \infty$ (as it must if
$\alpha$ is nonzero and positive), (52) may be translated into the following
set of conditions, which refer to poles and zeros in the finite part of the
$p$ plane:

a. The poles of A and B are the zeros of N + S, in both half planes,

b. At the lhp poles of N and S,

       A = \frac{e^{\alpha p} S}{N + S},

c. The rhp poles of N and S are zeros of B,   (53)

d. At the poles of A and B,

       \frac{A}{B} = -e^{-pT},

e. The combined degrees of the numerators of A and B are to be as
small as possible and the relative degrees are to be adjusted as
required by (52e).

[Fig. 10 - Inverse transforms of A and B e^{-pT}. Curve labels: K_A(t) = inverse
transform of A(p); K_B(t - T) = inverse transform of B(p) e^{-pT}.]

An analysis of degrees, as in Section 3.2.1, shows that (53) determines
a unique A and B, which satisfy (52) when S = O(\omega^{-2}) as \omega \to \infty. When
\alpha = 0, S is not necessarily zero at \infty. Then (52c) imposes additional condi-
tions on A. When (53) is suitably modified to include these new condi-
tions, a unique A and B are again determined.

It is interesting to examine the forms of the inverse transforms of A
and B e^{-pT}, which together make up the inverse transform K_M(t) of
Y_M(p). The inverse transforms of general rational functions are equal
piecewise to sums of exponentials. The exponentials are different for
t > 0 and t < 0, and there may be discontinuities (and also \delta functions)
at t = 0. In B e^{-pT}, the discontinuities are displaced to t = T. Then the
two time functions may take the form suggested in Fig. 10.
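
As a small numerical illustration of the form (50), the sketch below takes a hypothetical finite-duration response K(t) = exp(-at) on 0 < t < T. This response, the constants a and T, and the transform convention (integral of K(t) exp(-pt) dt, with no normalizing factor) are assumptions made here, not taken from the text. For this K, A = 1/(p + a) and B = -exp(-aT)/(p + a). The code checks that A + B e^{-pT} agrees with a direct numerical transform, and that the residues of A and B e^{-pT} cancel at the common pole p = -a, so that the sum is regular there.

```python
import cmath

def transform_numeric(a, T, p, steps=20000):
    """Trapezoidal estimate of the integral of exp(-a t) exp(-p t) over 0 <= t <= T."""
    h = T / steps
    total = 0.5 * (1.0 + cmath.exp(-(a + p) * T))
    for k in range(1, steps):
        total += cmath.exp(-(a + p) * k * h)
    return total * h

def transform_closed_form(a, T, p):
    """A + B exp(-pT) with A = 1/(p + a) and B = -exp(-aT)/(p + a)."""
    A = 1.0 / (p + a)
    B = -cmath.exp(-a * T) / (p + a)
    return A + B * cmath.exp(-p * T)

def residue_sum_at_pole(a, T):
    """Residue of A plus residue of B times exp(-pT), evaluated at p = -a."""
    res_A = 1.0
    res_B = -cmath.exp(-a * T)
    return res_A + res_B * cmath.exp(a * T)

a, T = 0.7, 2.0                      # illustrative constants
p = 0.3 + 1.5j                       # an arbitrary test point in the p plane
err = abs(transform_numeric(a, T, p) - transform_closed_form(a, T, p))
print(err, abs(residue_sum_at_pole(a, T)))
```

Both printed numbers should be very small: the first is limited by the trapezoidal rule, the second by rounding only.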

As in Section 3.2.2, Y_M cannot generally be realized with a finite net-
work of lumped elements and ideal delay lines. It can be approximated
by a transversal filter, plus differentiations if K_M(t) is not a bounded
function. The transversal filter need not involve transmission times
greater than T. If preferred, the rrhp parts of A and B can be separated
out and realized with a finite network of lumped elements plus a single
delay line.

3.3.2 Prediction for a Past Time (\alpha < 0).

When \alpha = -\beta, (52c) excludes a rational A. Recalling (40), we let A
be

A = D + \frac{S}{N + S} e^{-\beta p}.   (54)

Then when \alpha = -\beta, (52) becomes the following:*

a. D + \frac{S}{N + S} e^{-\beta p} + B e^{-pT} is rrhp,

b. \bar{B} + \bar{D} e^{-pT} + \frac{S}{N + S} e^{-(T - \beta)p} is rrhp,

c. D(N + S) is rlhp,   (55)

d. B(N + S) is rrhp,

e. When \omega is real and \to \infty, the following functions = O(\omega^{-2}):

|D|^2 N,  |D|^2 S,
|B|^2 N,  |B|^2 S.

Consider the term e^{-(T - \beta)p} in (55b). When \beta < T, the exponential
behaves at \infty in a manner suitable for an rrhp function, as defined in
(8). Then D and B are both rational, and can be calculated from zero
and pole conditions implied by (55).

When \beta > T, the exponential e^{-(T - \beta)p} is not suitable for an rrhp func-
tion, and it must be annulled by an exponential term in B. Let

B = J - \frac{S}{N + S} e^{-(\beta - T)p}.   (56)

Using both (54) and (56) in (50) and (52) gives

a. D + J e^{-pT} is rrhp,

b. \bar{J} + \bar{D} e^{-pT} is rrhp,

c. D(N + S) is rlhp,   (57)

d. J(N + S) - S e^{-(\beta - T)p} is rrhp,

e. When \omega is real and \to \infty, the following functions = O(\omega^{-2}):

|D|^2 N,  |D|^2 S,
|J|^2 N,  |J - e^{-(\beta - T)p}|^2 S.

When \beta > T, D and J are rational, and can be found from (57).

A comparison of (57) and (52) indicates the following relationship:†

If Y_M is the optimum Y_G for estimating s(t + \alpha), then \bar{Y}_M e^{-pT}
is the optimum Y_G for estimating s(t - T - \alpha).   (58)

The two times, t + \alpha and t - T - \alpha, are symmetrically located relative
to the interval t - T < \tau < t, for which values of f(\tau) are available.
The relation (58) may be viewed as a direct consequence of the sym-
metry of auto-covariance functions.

* Because N and S are even, \bar{N} = N and \bar{S} = S.

† In making the comparison, recall that \bar{Y} is rrhp when Y is rlhp, and vice
versa.
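
The symmetry (58) can be checked on a discretized version of the finite-memory problem. The sketch below is an illustration under assumptions that are not in the text: the data f are sampled at n equally spaced points in the window, the signal covariance is exp(-|tau|), and the noise is white with variance 0.5 at the samples. Under these assumptions, the least-squares weights for estimating s at a distance alpha beyond the end of the window should be the time-reversed weights for estimating s at the same distance before the start.

```python
import math

def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def signal_cov(tau):
    """Assumed signal auto-covariance (even in tau)."""
    return math.exp(-abs(tau))

def window_weights(times, target, noise_var=0.5):
    """Least-squares weights on f(times) for estimating s(target)."""
    n = len(times)
    cov_f = [[signal_cov(times[i] - times[j]) + (noise_var if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    cross = [signal_cov(target - times[i]) for i in range(n)]
    return solve(cov_f, cross)

T, n, alpha = 2.0, 9, 0.4
times = [-T + k * T / (n - 1) for k in range(n)]   # window from t - T to t, with t = 0
w_future = window_weights(times, alpha)            # estimate s(t + alpha)
w_past = window_weights(times, -T - alpha)         # estimate s(t - T - alpha)
mismatch = max(abs(a - b) for a, b in zip(w_future, reversed(w_past)))
print(mismatch)
```

The printed mismatch should be at rounding level, which is the discrete counterpart of replacing Y_M by \bar{Y}_M e^{-pT}.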

3.3.3 A Time-Domain Interpretation 

In Section 3.2.3 we derived Bode and Shannon's explicit time-domain
solution of the infinite memory problem. When memories must be finite 
there are difficulties which exclude an explicit time-domain solution of 
the same sort. The difficulties themselves are informative, however, and 
a simple time-domain analysis also furnishes an alternative derivation 
of (52). 

The starting point is now (26) in place of (23). Like K_G, \Delta_K is now
restricted by (46). Then the limits of integration may just as well be
reduced to 0 < \tau < T and (26) becomes

\int_0^T [K_M(\tau) * \Phi_f(\tau) - \Phi_s(\tau + \alpha)] \Delta_K(\tau) \, d\tau = 0.   (59)

The variation function \Delta_K can have any arbitrary values at times within
the interval of integration. Therefore, the other factor in the integrand
must be zero over the same interval 0 < \tau < T. Referring to Fig. 11,
we may divide it into two parts, one of which is zero when \tau > 0,
and the other when \tau < T. These may be written as follows:

K_M(\tau) * \Phi_f(\tau) - \Phi_s(\tau + \alpha) = K_U(\tau) + K_V(\tau - T),   (60)

in which

K_U(\tau) = 0 when \tau > 0,

K_V(\tau) = 0 when \tau < 0.

[Fig. 11 - The functions K_U and K_V. Curve labels: K_U(\tau), with K_U(\tau) = 0
when \tau > 0; K_V(\tau - T), with K_V(\tau) = 0 when \tau < 0.]

Take the transform of both sides of (60), and recall that the trans-
forms of K_M, \Phi_f, \Phi_s are Y_M, N + S, S. Then

Y_M(N + S) - e^{\alpha p} S = U + V e^{-pT},

U is rlhp,   (61)

V is rrhp.

Compare this with (35b) of Section 3.2.

In Section 3.2.3, we divided (35b) by \bar{Y}_f to obtain (44) and a simple
physical interpretation. What happens if (61) is divided by \bar{Y}_f? The
result is as follows:

Y_M Y_f - e^{\alpha p} S / \bar{Y}_f = U / \bar{Y}_f + (V / \bar{Y}_f) e^{-pT},

U / \bar{Y}_f is rlhp,   (62)

V / \bar{Y}_f is not rrhp.

Because V / \bar{Y}_f is not rrhp, the inverse transform of the left-hand side
is not zero in the interval 0 < \tau < T (nor in any other significant inter-
val). Furthermore, while a Y_M Y_f which is rrhp still implies that Y_M is
rrhp, it is not simple to solve for Y_M Y_f in such a way that \bar{Y}_M e^{-pT} is
also rrhp. Thus division by \bar{Y}_f is no longer useful.

To obtain an alternative derivation of (52), use the Y_M of (50) in
(61). Then

(A + B e^{-pT})(N + S) - e^{\alpha p} S = U + V e^{-pT},

U is rlhp,   (63)

V is rrhp.

Equate, separately, the terms which do and do not involve e^{-pT}, and
cancel the rrhp factor e^{-pT} from those that do. Then

A(N + S) - e^{\alpha p} S = U, which is rlhp,
                                                   (64)
B(N + S) = V, which is rrhp.

These equations correspond exactly to conditions c and d of (52).
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3.3.4 A Somewhat More General Method 

In some problems (59) must be replaced by a slightly more general
expression. In particular, \Phi_s(\tau + \alpha), the last term in the square brackets,
sometimes must be replaced by a more general function of \tau, say K_C(\tau).
This is characteristic of problems involving linear constraints, which
will be discussed further in Section IV. The methods described above
may be modified in such a way that the transform of K_C(\tau) need not be
rational, provided \Phi_f still corresponds to a rational spectrum.

Suppose Y_M is to be determined from conditions of the following sort
(in which Y_M, \Delta_Y, F, Y_C are of course the transforms of K_M, \Delta_K, \Phi_F,
K_C):

a. \int_0^T [K_M(\tau) * \Phi_F(\tau) - K_C(\tau)] \Delta_K(\tau) \, d\tau = 0;

b. K_M(t), \Delta_K(t) = 0 except when 0 < t < T
   [then Y_M, \Delta_Y belong to the classes (47), (48)];

c. |Y_M|^2 F = O(\omega^{-2}), when \omega is real and \to \infty;   (65)

d. \Phi_F(t) = an auto-covariance function,
   F = Y_F(p) \bar{Y}_F(p) = a rational function;

e. K_C(t) = a given time function, Y_C(p) need not be rational.

Let a new time function K_c(t) (written here with a lower-case subscript),
with transform Y_c(p), be defined as follows:

K_c(t) = K_C(t) when 0 < t < T,
                                          (66)
       = 0 when t < 0 or t > T.

Then, exactly as with Y_M, we must have

Y_c(p) and \bar{Y}_c e^{-pT} are rrhp and rfpp.   (67)

In these terms (65) implies the following, replacing our previous (60):

K_M(\tau) * \Phi_F(\tau) - K_c(\tau) = K_U(\tau) + K_V(\tau - T),

K_U(\tau) = 0 when \tau > 0,   (68)

K_V(\tau) = 0 when \tau < 0.

The functions K_U and K_V are like K_U and K_V of (60), except that they
include the "tails" of K_C lying outside the interval 0 < \tau < T.




Now take the transforms of all functions in (68), and solve for Y_M(p).
If U and V are the transforms of K_U and K_V, the result is

Y_M = \frac{Y_c + U + V e^{-pT}}{F},

a. U is rlhp,   (69)

b. V is rrhp,

c. |Y_M|^2 F = O(\omega^{-2}) when \omega is real and \to \infty.

The functions U and V must be such that Y_M is rfpp. But Y_c itself is
rfpp. As a result,

a. The finite poles of U = the rhp poles of F;

b. The finite poles of V = the lhp poles of F;

c. If p = p_k at a finite zero of F,   (70)

   Y_c(p_k) + U(p_k) + V(p_k) e^{-p_k T} = 0;

d. When \omega is real and \to \infty, U and V must behave
   in such a way that |Y_M|^2 F = O(\omega^{-2}).

These conditions determine a unique rational U and V when F is
rational, provided K_C(t) satisfies certain requirements relating to
continuity. If F \to C \omega^{-2m} as \omega \to \infty, then K_C(t) and its first m - 1
derivatives must be continuous in the interval 0 < t < T. When the
continuity condition is violated, there will be no K_M which meets all the
conditions (65), unless (65c) is modified. (See Section 4.3 for an inter-
pretation in connection with a particular application.)

The transform Y_C(p) of K_C(t) need not be rational. Furthermore,
when K_C(t) is given the time function K_M(t) can be calculated without
actually evaluating Y_c(p), except at the special points p = p_k. In particu-
lar, the following two relations may be used in evaluating the inverse
transform of the right-hand side of (69), with help of (70):

a. Y_c(p_k) = \frac{1}{\sqrt{2\pi}} \int_0^T K_C(\tau) e^{-p_k \tau} \, d\tau,

b. Inverse transform of \frac{Y_c}{F} = \frac{1}{\sqrt{2\pi}} \int_0^T K_C(\tau) K_{(1/F)}(t - \tau) \, d\tau,   (71)

where K_{(1/F)}(t) = inverse transform of 1/F.
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3.4 Simultaneous Optimization of Two Network Functions 

In each of our problems so far we have been required to find a single 
frequency function, Y_M(p), which is to represent the optimum choice
of a single linear operator, Y_G(p). There are other problems, however,
in which two or more frequency functions are to be found, corresponding 
to the simultaneously optimum choices of two or more different but 
related linear operators. This section develops general methods in terms 
of one such problem. 

Suppose the same signal can be observed in two different ways, or at
two different places, involving contamination with noise from two
different (uncorrelated) sources. An important example which will be
discussed later (but not the only example) is the observation of a single
physical variable with instruments of two different kinds. There are now
two observed time functions, f_1(t) and f_2(t), related to s(t) by

f_1(t) = s(t) + n_1(t),
                               (72)
f_2(t) = s(t) + n_2(t).

The two different time functions are to be modified by (different)
linear operations, and the results are to be added, to obtain an optimum
estimate of s(t + \alpha). All the assumptions of Section 2.2 are to be retained.
Thus, Gaussian ensembles may be substituted for s(t), n_1(t) and n_2(t),
and Fig. 2 may be replaced by Fig. 12.

From Fig. 12, the following integral is easily obtained, in place of
(19):

\sigma^2 = \int_{-\infty}^{\infty} ( |Y_{G1}|^2 N_1 + |Y_{G2}|^2 N_2 + |Y_{G1} + Y_{G2} - e^{\alpha p}|^2 S ) \, d\omega.   (73)

[Fig. 12 - A physical model for a signal which is observed in two different ways.
Block label: Source No. 3.]

Here N_1 and N_2 are, of course, the power spectra of the two noise ensem-
bles (assumed to be uncorrelated). Let Y_{M1} and Y_{M2} be the optimum
choices of Y_{G1} and Y_{G2}, and define \Delta_{Y1} and \Delta_{Y2} by

Y_{G1} = Y_{M1} + \Delta_{Y1},
                                  (74)
Y_{G2} = Y_{M2} + \Delta_{Y2}.

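Then (73) may be replaced by

\sigma^2 = \int_{-\infty}^{\infty} ( |Y_{M1}|^2 N_1 + |Y_{M2}|^2 N_2 + |Y_{M1} + Y_{M2} - e^{\alpha p}|^2 S ) \, d\omega

    + \int_{-\infty}^{\infty} ( |\Delta_{Y1}|^2 N_1 + |\Delta_{Y2}|^2 N_2 + |\Delta_{Y1} + \Delta_{Y2}|^2 S ) \, d\omega

    + 2 \int_{-\infty}^{\infty} { [Y_{M1}(N_1 + S) + Y_{M2} S - e^{\alpha p} S] \bar{\Delta}_{Y1}

        + [Y_{M1} S + Y_{M2}(N_2 + S) - e^{\alpha p} S] \bar{\Delta}_{Y2} } \, d\omega.   (75)

The first integral in (75) is \sigma_M^2, corresponding to a \Delta_{Y1} and \Delta_{Y2} which
are zero identically. The second integral is positive for any \Delta_{Y1} and
\Delta_{Y2} which are not zero identically. Then, if \Delta_{Y1} and \Delta_{Y2} each obeys (29),
the last integral must be zero. This replaces (23) by the following:

For every permitted \Delta_{Y1} and \Delta_{Y2}:

\int_{-i\infty}^{+i\infty} { [Y_{M1}(N_1 + S) + Y_{M2} S - e^{\alpha p} S] \bar{\Delta}_{Y1}

    + [Y_{M1} S + Y_{M2}(N_2 + S) - e^{\alpha p} S] \bar{\Delta}_{Y2} } \, dp = 0.   (76)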

The one integral can often be split into two, each involving one \Delta_Y,
and each equal to zero. The separation may not be permissible, however,
when N_i/S = O(\omega^{-2}) at \omega = \infty. An examination of convergence rules,
noted below, will clarify this statement. Convergence conditions may
again be obtained by excluding, at once, all Y_G's and \Delta_Y's for which
\sigma^2 = \infty. Conditions are easily derived from the first two integrals in
(75). These may be combined to obtain a condition on the integrand in
(76). The result is

When \omega is real and \to \infty, the following seven functions = O(\omega^{-2}):

|Y_{M1}|^2 N_1,  |Y_{M2}|^2 N_2,  |Y_{M1} + Y_{M2} - e^{\alpha p}|^2 S,
                                                                   (77)
|\Delta_{Y1}|^2 N_1,  |\Delta_{Y2}|^2 N_2,  |\Delta_{Y1} + \Delta_{Y2}|^2 S,

and the integrand in (76).

Note that |\Delta_{Y1}|^2 S and |\Delta_{Y2}|^2 S do not necessarily = O(\omega^{-2}), provided
they are restricted by a linear relation at \omega = \infty.*

3.4.1 Optimum Physical Networks When \alpha \geq 0

Suppose Y_{M1} and Y_{M2} are to be selected from the class of all "physical"
frequency functions, that is, from the C_Y of (33). Then \Delta_{Y1} and \Delta_{Y2}
belong to the C_\Delta of (34). Following our usual procedure, we fit the inte-
grand of (76) to H(p) of (12). Behavior at \infty is already proper, and
\bar{\Delta}_{Y1} and \bar{\Delta}_{Y2} are both rlhp. Then we have only to make the two associated
factors also rlhp. Assembling these requirements with conditions of
physicalness and convergence gives:

a. Y_{M1} is rrhp,

b. Y_{M2} is rrhp,

c. Y_{M1}(N_1 + S) + Y_{M2} S - e^{\alpha p} S is rlhp,   (78)

d. Y_{M1} S + Y_{M2}(N_2 + S) - e^{\alpha p} S is rlhp,

e. When \omega is real and \to \infty, the following three functions =
O(\omega^{-2}):

|Y_{M1}|^2 N_1,  |Y_{M2}|^2 N_2,  |Y_{M1} + Y_{M2} - e^{\alpha p}|^2 S.

When \alpha \geq 0, Y_{M1} and Y_{M2} are rational (assuming rational N_1, N_2, S).
It is not at once apparent, however, what the poles of Y_{M1} and Y_{M2} will
be. From (78c) and (78d), we can define two rlhp functions, say U_1 and
U_2, by:

Y_{M1}(N_1 + S) + Y_{M2} S - e^{\alpha p} S = U_1,
                                                     (79)
Y_{M1} S + Y_{M2}(N_2 + S) - e^{\alpha p} S = U_2.

Solving for Y_{M1} and Y_{M2} gives:

Y_{M1} = \frac{U_1 N_2 + (U_1 - U_2) S + e^{\alpha p} S N_2}{N_1 N_2 + (N_1 + N_2) S},
                                                                          (80)
Y_{M2} = \frac{U_2 N_1 - (U_1 - U_2) S + e^{\alpha p} S N_1}{N_1 N_2 + (N_1 + N_2) S}.
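
The step from (79) to (80) is an ordinary two-by-two linear solve at each value of p. As a check, the sketch below solves (79) by Cramer's rule at a single point and compares the result with the closed forms (80). The numerical values used for N_1, N_2, S, U_1, U_2 and \alpha are arbitrary test numbers chosen here for illustration; they are not spectra from the text and the test U's are not claimed to be rlhp.

```python
import cmath

def solve_79(N1, N2, S, U1, U2, e):
    """Solve the pair (79) for (Y_M1, Y_M2) by Cramer's rule; e stands for exp(alpha*p)."""
    det = (N1 + S) * (N2 + S) - S * S
    r1 = U1 + e * S          # right-hand side of the first equation
    r2 = U2 + e * S          # right-hand side of the second equation
    y1 = (r1 * (N2 + S) - S * r2) / det
    y2 = ((N1 + S) * r2 - S * r1) / det
    return y1, y2

def closed_form_80(N1, N2, S, U1, U2, e):
    """The closed forms (80), with common denominator N1*N2 + (N1 + N2)*S."""
    den = N1 * N2 + (N1 + N2) * S
    y1 = (U1 * N2 + (U1 - U2) * S + e * S * N2) / den
    y2 = (U2 * N1 - (U1 - U2) * S + e * S * N1) / den
    return y1, y2

p = 0.2 + 1.3j               # arbitrary test point
alpha = 0.5                  # arbitrary prediction time
values = dict(N1=1.0 / (4.0 - p * p), N2=2.0, S=3.0 / (1.0 - p * p),
              U1=0.3 - 0.1j, U2=-0.2 + 0.4j, e=cmath.exp(alpha * p))
direct = solve_79(**values)
formula = closed_form_80(**values)
print(max(abs(a - b) for a, b in zip(direct, formula)))
```

Setting U_1 = U_2 = 0 in either function gives the unrestricted solution, in which each channel is weighted by the other channel's noise spectrum.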
Since Y_{M1} and Y_{M2} are to be rrhp, while U_1 and U_2 are to be rlhp, the
two pairs of functions can have no finite poles in common. Then every
finite pole of Y_{M1} or Y_{M2} in (78) must belong to one or the other of the
following classes:

a. The lhp zeros of N_1 N_2 + (N_1 + N_2) S;

b. For Y_{M1}, any lhp poles of N_2 or S which are not poles of
N_1 N_2 + (N_1 + N_2) S; for Y_{M2}, any lhp poles of N_1 or S   (81)
which are not poles of N_1 N_2 + (N_1 + N_2) S.

The second class occurs only in special or degenerate cases, such as
when zeros of N_1 + N_2 happen to coincide with poles of S.

* For example, suppose N_1, N_2 = O(\omega^{-4}) as \omega \to \infty, and S = O(\omega^{-2}). Then |\Delta_{Y1}|^2
and |\Delta_{Y2}|^2 may = O(\omega^2), provided \Delta_{Y1} and \Delta_{Y2} are so related that |\Delta_{Y1} + \Delta_{Y2}|^2 =
O(\omega^0). But this \Delta_{Y1} and \Delta_{Y2} cannot be chosen entirely independently, and then the
integral in (76) must not be split in two.

After the permissible poles have been determined by (81), (78) may
be used to find the numerators of Y_{M1} and Y_{M2}. Simultaneous linear
equations in the numerator coefficients may be derived from required
behavior at lhp poles of S, N_1, N_2 and also at poles of Y_{M1} and Y_{M2}
themselves. For the latter, take the difference of (78c) and (78d) to
get:

(Y_{M1} N_1 - Y_{M2} N_2) is rlhp.   (82)

Numerators of the minimum degree are determined uniquely, and also
turn out to be just compatible with (78e). In special or degenerate cases,
certain of the poles, permitted by (81), may coincide with zeros of the
corresponding numerators of Y_{M1} and Y_{M2} and can be cancelled out.

3.4.2 Optimum Physical Networks When \alpha < 0

When \alpha = -\beta, with \beta > 0, Sections 3.2.2 and 3.4.1 may be combined
to get

Y_{M1} = A_1 + e^{-\beta p} \frac{S N_2}{N_1 N_2 + (N_1 + N_2) S},
                                                             (83)
Y_{M2} = A_2 + e^{-\beta p} \frac{S N_1}{N_1 N_2 + (N_1 + N_2) S}.

Here, A_1 and A_2 are two different rational functions, not necessarily
rrhp (even though adding the exponential terms must make Y_{M1},
Y_{M2} rrhp). Substituting in (78) and using a generalization of (32) gives*

a. A_1 + \frac{S N_2}{N_1 N_2 + (N_1 + N_2) S} e^{-\beta p} is rrhp,

b. A_2 + \frac{S N_1}{N_1 N_2 + (N_1 + N_2) S} e^{-\beta p} is rrhp,

c. A_1(N_1 + S) + A_2 S is rlhp,   (84)

d. A_1 S + A_2(N_2 + S) is rlhp,

e. When \omega is real and \to \infty, the following three functions = O(\omega^{-2}):

|A_1|^2 N_1,  |A_2|^2 N_2,  |A_1 + A_2|^2 S.

* Paralleling Section 3.1 gives, in place of (32):

... / [N_1 N_2 + (N_1 + N_2) S] = O(\omega^{...}) as \omega \to \infty.

In (84), conditions a and b put the rhp poles of A_1 and A_2 at the rhp
poles of the functions which multiply e^{-\beta p}. Conditions c and d permit
the same lhp poles as the poles of Y_{M1} and Y_{M2} for \alpha > 0, which are
determined by (81). Combining the lhp and rhp conditions shows that
every finite pole of A_1 or A_2 must belong to one or the other of the fol-
lowing classes:

a. The zeros of N_1 N_2 + (N_1 + N_2) S;

b. For Y_{M1}, any poles of N_2 or S which are not poles of N_1 N_2 +
(N_1 + N_2) S; for Y_{M2}, any poles of N_1 or S which are not   (85)
poles of N_1 N_2 + (N_1 + N_2) S.

These are merely the extensions of the classes (81), to include poles
in both half planes.

The poles of A_1 and A_2 are twice as numerous as the poles of Y_{M1}
and Y_{M2} for \alpha > 0. As a result there are more numerator coefficients to
be determined, but they are still determined by simultaneous linear
equations.

3.4.3 A Time-Domain Interpretation 

Paralleling Section 3.2.3 gives a time-domain interpretation. Although
it does not represent a significant simplification of the present problem,
it does have theoretical interest, and also a potential usefulness in varia-
tions of the present problem.

In Section 3.2.3, we multiplied the rlhp function in (35b) by the rlhp
function 1/\bar{Y}_f, so as to obtain the rlhp function (44), in which the
rrhp term Y_M Y_f may be regarded as the only unknown. We now multiply
the two rlhp functions in (78) by two different rlhp functions and add
the result. The object is to obtain an rlhp function with a single unknown
term which is itself rrhp.

Let Q_1 and Q_2 be two (as yet unknown) functions of p, both required
to be rrhp. Then \bar{Q}_1 and \bar{Q}_2 are both rlhp. If we multiply the two rlhp
functions in (78c) and (78d) by \bar{Q}_1 and \bar{Q}_2 and add the result, the sum
will have to be rlhp. The sum can be written as follows:

(Y_{M1} Y_a + Y_{M2} Y_b) - e^{\alpha p} S (\bar{Q}_1 + \bar{Q}_2) is rlhp,

Y_a = \bar{Q}_1(N_1 + S) + \bar{Q}_2 S,   (86)

Y_b = \bar{Q}_1 S + \bar{Q}_2(N_2 + S).

If Y_a and Y_b are both rrhp, as well as Y_{M1} and Y_{M2}, the function
(Y_{M1} Y_a + Y_{M2} Y_b) will be rrhp and can be found from (86) by the
Bode-Shannon method. This function can be used to express (82) in
terms of Y_{M1} or Y_{M2} alone. Thus, Y_{M1} and Y_{M2} may be found by a se-
quence of straightforward calculations, as soon as Q_1 and Q_2 are known.
The problem is to find a Q_1 and a Q_2 such that Y_a and Y_b are, in fact,
rrhp. It may be more informative to use the equivalent requirement
that \bar{Y}_a and \bar{Y}_b must be rlhp. Then (since N_1, N_2, S are even), (86)
requires

a. \bar{Y}_a = Q_1(N_1 + S) + Q_2 S is rlhp,

b. \bar{Y}_b = Q_1 S + Q_2(N_2 + S) is rlhp,   (87)

c. Q_1, Q_2 are rrhp.

Equations (87) are about as hard to solve for Q_1 and Q_2 as (78) is for
Y_{M1} and Y_{M2}. The poles of Q_1 and Q_2 are exactly the poles of Y_{M1} and
Y_{M2}. The numerators are different and are not uniquely determined.

Note that the calculation of Q_1 and Q_2 does not involve the terms
e^{\alpha p} S of (78). These enter only in the Bode-Shannon type of analysis.
The method has potential usefulness in variations of the present prob-
lem, in which the terms e^{\alpha p} S are replaced by more complicated functions.
(Compare this section with Section 3.3.4.)

3.5 Sampled-Data Systems

This section shows how the methods which we applied in the previ-
ous sections to continuous-data systems may be modified for application
to discrete, or sampled-data systems. Methods appropriate for discrete
systems can, of course, be derived without reference to continuous-data
systems; and they have been so derived by, for example, Levinson (Ref.
1, Appendix B), and Lloyd and McMillan. It is interesting to observe,
however, how simply the transformation may be accomplished, from
techniques for continuous-data systems to techniques for discrete-data
systems. For this purpose, it will be sufficient to outline an appropriate
procedure, without filling in details.

Two different kinds of discrete-data systems may be considered. The
signal and noise may be discrete time series, described by statistics
appropriate for such series. On the other hand, the signal, s(t), and the
noise, n(t), may themselves be continuous time series, with the obser-
vations of signal-plus-noise, f(t), limited to discrete "sampling" times.
Methods appropriate for both kinds of discrete-data systems may be
derived from a study of continuous-data systems of a special kind.

3.5.1 A Special Kind of Continuous-Data System 

Following the usual methods of "z transform" theory, let z = e^{-pT},
in which T is to be the sampling interval. Rational functions of z, when
treated as functions of p, now have inverse transforms which are zero
except at the discrete times \sigma T. The methods which we have applied to
continuous-data systems do not apply directly to spectra, S and N,
which are simply rational functions of z. The behavior at p = \infty leads
to divergent integrals such as, for example, the integral in (31). On the
other hand, spectra of the sort described below meet the convergence
conditions and also lead to Y_M(p) which are exactly rational functions
of z. Then the only data which are actually utilized are the data observed
at the discrete times t - \sigma T.
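
To see why a rational function of z corresponds to impulses at the sampling times only, note that a function regular at z = 0 expands in powers of z, and each term c_k z^k = c_k e^{-kpT} is the transform of an impulse of weight c_k at time kT (up to whatever normalizing constant the transform convention carries). The sketch below carries out this expansion by long division on polynomial coefficient lists. The function name and the example 1/(1 - 0.5z) are illustrative choices made here, not taken from the text.

```python
def series_coefficients(numerator, denominator, count):
    """First `count` power-series coefficients of numerator(z)/denominator(z).

    Both arguments list polynomial coefficients in ascending powers of z,
    and denominator[0] must be nonzero (regularity at z = 0).
    """
    coeffs = []
    for k in range(count):
        acc = numerator[k] if k < len(numerator) else 0.0
        # Subtract the contributions of earlier coefficients (long division).
        for j in range(1, min(k, len(denominator) - 1) + 1):
            acc -= denominator[j] * coeffs[k - j]
        coeffs.append(acc / denominator[0])
    return coeffs

# Impulse weights at times 0, T, 2T, ... for the example 1 / (1 - 0.5 z).
weights = series_coefficients([1.0], [1.0, -0.5], 6)
print(weights)
```

For this example the weights fall off geometrically, which is the sampled counterpart of a single decaying exponential.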

Note that \bar{z} = 1/z, and that |z| > 1 in the left half of the p plane.
Use the notation Y(p) = Y_z(z) and \bar{Y}_z(z) = Y_z(1/z). In these terms, N
and S are to be

a. S = \frac{q}{1 + \epsilon^2 \omega^2} Y_{Sz} \bar{Y}_{Sz},

b. N = \frac{q}{1 + \epsilon^2 \omega^2} Y_{Nz} \bar{Y}_{Nz},   (88)

c. Y_{Sz}(z), Y_{Nz}(z) are rational in z,

d. Y_{Sz}, Y_{Nz}, 1/Y_{Sz}, 1/Y_{Nz} are regular at |z| < 1.

The constants q and \epsilon are to be real and positive.

Suppose \alpha > 0, in s(t + \alpha), and Y_M is to be the optimum "physical"
Y_G, when N and S are described by (88). The procedure explained in
Section 3.2 is easily adapted to the new problem. Corresponding to
(35), one gets:

a. Y_M(p) is rrhp,

b. \frac{q}{1 - \epsilon^2 p^2} Y_Z(p) is rlhp,   (89)

c. |Y_{Zz}| < \infty at z = \infty.

These conditions are satisfied by a Y_M(p) such that Y_{Mz}(z) is rational
and meets the following conditions:

a. Y_{Mz}(z) is regular when |z| < 1,

b. Y_{Zz}(z) is regular when |z| > 1,

c. Y_{Zz}(z) = 0 when 1 + \epsilon p = 0,   (90)

d. 0 < |Y_{Zz}(z)| < \infty at z = \infty.

The factor q/(1 + \epsilon^2 \omega^2) was applied to N and S in (88) to obtain con-
vergence of integrals in the derivation of (89) and (90). However, we
can make \epsilon as small as we wish, and Y_M approaches a reasonable limit
as \epsilon \to 0. The poles of Y_{Mz} are completely independent of \epsilon. The numera-
tor approaches a limit such that condition (90c) disappears, and (90d)
changes to

Y_{Zz}(z) = 0 at z = \infty.   (91)

This follows from the fact that (90c) requires Y_{Zz} to have a numerator
factor in z which is zero at the zero of 1 + \epsilon p, namely, a factor

1 - e^{-T/\epsilon} z = 1 - e^{-(T/\epsilon)(1 + \epsilon p)}.   (92)

When \epsilon is small, but not zero, the autocovariances corresponding to
the N and S of (88) take the form shown in Fig. 13. The "spikes" become
sharper as \epsilon becomes smaller. As \epsilon \to 0, the integrals of the N and S
of (88) become infinite unless q also \to 0 to order of \epsilon. The function Y_M,
however, is entirely independent of q, which need be considered quanti-
tatively only in calculating the corresponding variance, \sigma_M^2.

3.5.2 Discrete Time Series 

[Fig. 13 - A special form of auto-covariance function.]

Suppose S and N are the spectra of continuous time series s(t) and
n(t), such that the covariances are exactly zero except when the lag
time is \sigma T, with \sigma an integer. Let the prediction time \alpha also be an inte-
gral multiple of T. The corresponding Y_M will be such that the estimate
g(t), of s(t + \alpha), will depend only on the values assumed by s(t) and
n(t) at the discrete series of times t - \sigma T. Since the values of s(t) and n(t)
which occur at intermediate times are not utilized, and are not corre-
lated to the values which are utilized, they can have no significant re-
lation to the problem; they may be regarded as undefined without al-
tering the solution. Thus the special continuous-data system yields the
same Y_M as though s(t) and n(t) were discrete, ordered sequences of ran-
dom variables matching the values which are assumed by the con-
tinuous time series at the discrete times \sigma T.

Conversely, when s and n are initially discrete series, corresponding
continuous ensembles may be constructed for calculating the optimum
smoothing and prediction. When the covariances of the discrete series
are sums of exponentials in m_2 - m_1, where m_1 and m_2 are the order
numbers of the samples involved, the corresponding S and N will be
rational functions of z = e^{-pT}, and Y_M and K_M may be found by apply-
ing Section 3.5.1. Many details, of course, remain to be filled in.
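
For a feel of the simplest discrete case, the sketch below takes a hypothetical noise-free series whose covariance is a single exponential in the lag, a^|m_2 - m_1|, and checks a candidate predictor against the least-squares normal equations over a finite past. The covariance, the constant a, and the candidate weights are assumptions made here for illustration. With this covariance, putting all the weight on the latest sample satisfies every normal equation, so earlier samples add nothing.

```python
def cov(lag, a=0.6):
    """Covariance a**|lag| of a hypothetical discrete series (one exponential term)."""
    return a ** abs(lag)

def normal_equation_residuals(weights, steps_ahead, a=0.6):
    """Residuals of the least-squares normal equations for a finite past.

    weights[j] multiplies the sample j steps back from the present (j = 0 is
    the latest sample); the target lies `steps_ahead` samples in the future.
    """
    n = len(weights)
    residuals = []
    for i in range(n):
        fitted = sum(weights[j] * cov(i - j, a) for j in range(n))
        residuals.append(fitted - cov(steps_ahead + i, a))
    return residuals

a, steps_ahead, n = 0.6, 2, 5
candidate = [a ** steps_ahead] + [0.0] * (n - 1)   # all weight on the latest sample
worst = max(abs(r) for r in normal_equation_residuals(candidate, steps_ahead, a))
print(worst)
```

Adding noise, or a second exponential term in the covariance, would spread the weights over earlier samples; the same residual check still applies to any proposed set of weights.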



3.5.3 Discrete Samples of Continuous Time Series

Now suppose that s(t) and n(t) are continuous time series with ra-
tional spectra S and N, but that f(t) = s(t) + n(t) may be observed only
at discrete "sampling times", t = \sigma T. Let "present time" t coincide with
one of the sampling times and consider the calculation of the optimum
Y_M for estimating s(t + \alpha), with \alpha > 0.

In Section 3.2.4, we replaced the Gaussian statistical model shown
in Fig. 6 by that shown in Fig. 8, which has the same pertinent statisti-
cal characteristics. Now, the pertinent statistics are even more restricted,
and a further change in the Gaussian model is useful. In particular, the
frequency functions Y_f, Y_{fs} and the single random quantity v_s may be
modified in any way which leaves the following covariances unchanged
(in which \sigma is a nonnegative integer): the auto-covariance of f(t) at the
discrete lag times \sigma T, the cross-covariance of f(t) and s(t) at the discrete
lag times (\alpha + \sigma T) and the auto-covariance of s(t) at the lag time zero.

Our present problem may be solved by changing Y_f, Y_{fs}, N_s (with-
out changing the pertinent statistics), in such a way that the optimum
Y_M(p), computed on a continuous-data basis, will be the transform of
a K_M(t) which is zero except when t = \sigma T. The output of a correspond-
ing device, at any sampling time, will depend only on the values of the
input at sampling times. A suitable mechanization for the sampled-data
system, derived from this Y_M or K_M, may be either a fixed network
(or equivalent device) with sampling at input or output, or digital com-
putations carried out once each computing cycle.

When N and S are rational functions of frequency, Y_f and Y_{fs} may
be changed to rational functions of z, and Section 3.5.1 may be applied
to find Y_M as a rational function of z. When \alpha = mT, with m an integer,
it is sufficient to change N and S to suitable rational functions of z,
preserving auto-covariances of s(t) and n(t) at lag times \sigma T, and then to
apply Section 3.5.1. When \alpha \neq mT, however, Y_{fs} must be modified
further (or the related function S e^{\alpha p}). As in the previous section, many
details remain to be filled in.

3.6 Nonstationary Systems

In this section, we consider signal and noise ensembles which are
statistically nonstationary. Some years ago, Booton described integral
equations which determine the corresponding linear least-squares
smoothing and prediction operators. The general integral equations,
however, are very much more difficult to solve than are the equations
corresponding to stationary systems. On the other hand, the techniques
described above, for stationary systems, may be paralleled, in principle
at least, for analogous nonstationary systems. More specifically, a one-
to-one correspondence may be established between the individual rela-
tionships and operations used in the techniques and a new set of relation-
ships and operations which are appropriate for at least a large class of
nonstationary systems.

While it is clear that the new relationships and operations are appro-
priate for many nonstationary systems, the exact range of their appli-
cability has not been fully established. In addition, the manipulations
required in numerical problems are very much more complicated than
the stationary counterparts, and they may be feasible only for quite sim-
ple systems, even when electronic computers are available. Accordingly,
a very brief outline may be sufficient here, leaving a fuller exposition
for a later paper.

3.6.1 A Class of Nonstationary Systems

In previous sections, we considered finite networks of lumped elements,
driven by white noise, as physical models which generate stationary
Gaussian ensembles with rational spectra (Figs. 2, 3, 6 of Section II
and Fig. 8 of Section 3.2.4). We now change the picture only by permit-
ting the network components to be time variable; we thereby define a
class of nonstationary Gaussian ensembles, analogous to the stationary
ensembles which have rational spectra. Miller and Zadeh have studied
essentially the same class of nonstationary ensembles, in somewhat dif-
ferent terms.

Let V be the input and E the output of a finite network of lumped,
linear, time-variable elements. Then

P V = Q E,

P = a_0 + a_1 \frac{d}{dt} + ... + a_n \frac{d^n}{dt^n},   (93)

with Q a differential operator of the same kind, of order m, whose
coefficients involve b_\sigma and H. The coefficients a_\sigma, b_\sigma, H are now
generally time-variable.

Consider the functions U_\sigma, defined by P U_\sigma = 0, and the functions
L_\sigma, defined by Q L_\sigma = 0. There will always exist n linearly independent
U_\sigma's and m linearly independent L_\sigma's. The U_\sigma's may be referred to as
the basis functions, or bf's, and the L_\sigma's as the zero-response functions,
or zrf's. In the theory of nonstationary systems, the bf's correspond to
the poles of the admittances of stationary systems, and the zrf's corre-
spond to the zeros. In general, the bf's and the zrf's may be chosen in
any of various ways, since linear transformations of legitimate choices
are also legitimate choices. Frequently, however, a specific choice is
particularly useful, as in the situation described below.

Existence theorems are more easily established if the coefficients a_\sigma,
b_\sigma, H in P and Q, when viewed as functions of time t, are required to be
analytic at t = -\infty. Then, except in degenerate cases which need not
be considered here, the bf's, U_\sigma, and the zrf's, L_\sigma, may be so chosen that
they behave as follows, in accordance with Bellman (Ref. 23):

a. When a_\sigma, b_\sigma, H are analytic at t \to -\infty,

b. Then U_\sigma \to t^{\gamma_\sigma} e^{p_\sigma t} and L_\sigma \to t^{\mu_\sigma} e^{p'_\sigma t}.

The coefficients \gamma_\sigma and \mu_\sigma are constants. The p_\sigma's and p'_\sigma's may be
described further in terms of P and Q of (93). Let P_\infty(p) and Q_\infty(p) be
the polynomials derived from P and Q in the following way: Replace
d/dt by p. For coefficients, use the values assumed by the a_\sigma's and b_\sigma's
at t = -\infty. Then the p_\sigma's are the zeros of P_\infty(p) and the p'_\sigma's are the
zeros of Q_\infty(p). The same sort of conditions for "physical" networks may
now be applied to the p_\sigma's and p'_\sigma's as to the poles and zeros of stationary
admittances. (The coefficients must also behave "reasonably" in some
sense, although not necessarily analytically, at times other than
t \to -\infty.)

When the $p_\sigma$'s have negative real parts, as required for "physical"
networks, the integration of (93) gives (for m < n):

a. $V(t) = \int_{-\infty}^{+\infty} K(t, \tau)E(\tau)\,d\tau$,

b. $K(t, \tau) = \sum_{\sigma=1}^{n} U_\sigma(t)X_\sigma(\tau)$ when $\tau < t$,      (95)

   $K(t, \tau) = 0$ when $\tau > t$.

Here, K is the impulse response, as in previous sections, but it is no
longer a function of the single variable $t - \tau$. The $U_\sigma$'s are again
the bf's of the differential equation (93) and the $X_\sigma$'s are new
functions. Their calculation is at least straightforward, provided the
$U_\sigma$'s are known, as well as the coefficients of $Q$ in the differential
equation. The $X_\sigma$'s are roughly analogous to the residues at the poles
of the transfer admittance of a stationary network. (The functions
$X_\sigma/U_\sigma$ are a closer analog.)

3.6.2 Manipulations of Differential Equations

When a time-variable network is made up of networks in tandem or
networks in parallel, the differential equation for the complete network
may be found from the differential equations which correspond to the
different parts. The processes are analogous to, but not the same as, the
algebraic products and sums which are used to combine the rational
admittance functions of stationary partial networks. The differences
may be said to stem from the noncommutability of time-variable
coefficients and the derivative operator, $d/dt$.

$$\left(C\,\frac{d}{dt}\,V \neq \frac{d}{dt}\,CV, \text{ when } C \text{ is time-variable.}\right)$$
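The noncommutability noted above can be checked numerically. The following is a minimal sketch, not taken from the paper: the coefficient C(t) = 1 + t, the signal V(t) = sin t, and the helper names are arbitrary illustrative assumptions, and the derivative is approximated by a central difference.

```python
import math


def derivative(f, t, h=1e-5):
    """Central-difference approximation to df/dt at time t."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def commutator(C, V, t):
    """Return d/dt(C*V) - C*dV/dt at time t (hypothetical helper).

    By the product rule this equals (dC/dt)*V, so it vanishes only
    where C is constant or V is zero.
    """
    d_of_product = derivative(lambda u: C(u) * V(u), t)
    product_of_d = C(t) * derivative(V, t)
    return d_of_product - product_of_d


# Illustrative (assumed) choices of coefficient and signal.
time_variable_C = lambda t: 1.0 + t
constant_C = lambda t: 3.0
V = math.sin

gap_variable = commutator(time_variable_C, V, 1.0)  # close to sin(1)
gap_constant = commutator(constant_C, V, 1.0)       # close to 0
```

With a constant coefficient the two orders of operation agree; with the time-variable coefficient they differ by (dC/dt)V, which is why tandem combinations of time-variable networks cannot be formed by simple algebraic products.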
Conversely, differential equations such as (93) may be decomposed
into sets of simpler differential equations, corresponding to partial
networks connected either in tandem or parallel fashion. For this, however,
one needs to know the bf's of the differential equation (and also the
zrf's, for tandem circuit decompositions), just as one needs to know the
poles (and also the zeros, for tandem circuit decompositions) when
decomposing the admittances of stationary circuits. The operations are of
course analogous to, but not the same as, the factoring and partial
fraction expansion of admittance functions.

Manipulations of the sort noted above are described in more detail
in a previous paper by the author.

3.6.3 Auto-Covariances

When the input signal V is (unit level) white noise, the auto-covariance,
$\Phi$, of the output signal, E, may be calculated from the impulse
response, K. When K is as in (95b), $\Phi$ may be expressed as a somewhat
similar finite sum. More specifically,

a. $\Phi(t_2, t_1) = \int_{-\infty}^{+\infty} K(t_2, \tau)K(t_1, \tau)\,d\tau$,

b. $\Phi(t_2, t_1) = \sum_{\sigma=1}^{n} U_\sigma(t_2)W_\sigma(t_1)$ when $t_1 < t_2$,      (96)

   $\Phi(t_2, t_1) = \sum_{\sigma=1}^{n} W_\sigma(t_2)U_\sigma(t_1)$ when $t_1 > t_2$.

The $U_\sigma$'s are again the basis functions or bf's of the differential
equation (93). The $W_\sigma$'s are new functions, which may be calculated in
any of various ways.
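As a concrete check of the relation between the integral form and the finite-sum form, consider the simplest stationary special case. The sketch below is an illustration under stated assumptions, not the paper's procedure: it takes a one-pole impulse response K(t, τ) = exp(-a(t - τ)) for τ < t, treats "unit level" white noise as delta-correlated with unit weight, and reads the integral form as the integral of K(t2, τ)K(t1, τ) over τ. The function names are hypothetical.

```python
import math


def impulse_response(t, tau, a=1.0):
    """K(t, tau) = exp(-a (t - tau)) for tau < t, else 0 (assumed example)."""
    return math.exp(-a * (t - tau)) if tau < t else 0.0


def autocovariance(t2, t1, a=1.0, span=40.0, steps=40000):
    """Midpoint-rule estimate of the integral of K(t2,tau) K(t1,tau) d tau.

    The integrand vanishes for tau > min(t1, t2), and the lower limit
    -infinity is truncated to min(t1, t2) - span.
    """
    upper = min(t1, t2)
    h = span / steps
    total = 0.0
    for k in range(steps):
        tau = upper - (k + 0.5) * h
        total += impulse_response(t2, tau, a) * impulse_response(t1, tau, a)
    return total * h


def finite_sum_form(t2, t1, a=1.0):
    """One-term sum U(later) W(earlier), U(t)=exp(-a t), W(t)=exp(a t)/(2a)."""
    later, earlier = max(t1, t2), min(t1, t2)
    return math.exp(-a * later) * math.exp(a * earlier) / (2.0 * a)
```

For this example the single basis function U(t) = exp(-at) carries the dependence on the later time, and the new function W(t) = exp(at)/(2a) carries the dependence on the earlier time, so the result is symmetrical in the two times.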

A differential equation, corresponding to $\Phi$, may be defined as the
equation connecting two time functions, say $G_1$ and $G_2$, in such a way
that

$$G_2(t_2) = \int_{-\infty}^{+\infty} \Phi(t_2, t_1)\,G_1(t_1)\,dt_1.$$

The differential equation is of order 2n. Its bf's comprise both the
$U_\sigma$'s and $W_\sigma$'s of (96), but its zrf's can be found only by solving
a homogeneous differential equation with time-variable coefficients.

In (96a), $K(t_1, \tau)$ may be replaced by $K^a(\tau, t_1)$, where $K^a$ is the
"adjoint" of K, defined as the function obtained from $K(t, \tau)$ by
interchanging, within the function, the input time, $\tau$, and the output
time, $t$. The integral of the product $K(t_2, \tau)K^a(\tau, t_1)$ is a
convolution, representing the impulse response of the tandem combination of
two networks. The two networks are the original network preceded by a
(nonphysical) network with the adjoint response, as in Fig. 14.

We may now relate $G_1$ and $G_2$ through an intermediate variable, say
$G_m$, by means of

a. $P^aG_m = Q^aG_1$,
                                                                    (97)
b. $PG_2 = QG_m$.

Here, $P$ and $Q$ are as in (93), and $P^a$ and $Q^a$ are similar operators,
which lead to the adjoint $K^a(t, \tau)$ of the impulse response $K(t, \tau)$.

When systems are stationary, the transform of $\Phi$ is the product
$Y(p)\bar{Y}(p)$. When systems are nonstationary, the bf's and zrf's of (97a)
are analogous to the poles and zeros of $\bar{Y}(p)$, just as the bf's and
zrf's of (97b) are analogous to the poles and zeros of $Y(p)$. The simple
product of $Y$ and $\bar{Y}$ is replaced by the construction of a single
differential equation, from the two equations of (97), through the
elimination of the intermediate variable $G_m$. One half of the bf's of
$\Phi$ are exactly the bf's of $K$, and one half of the zrf's of $\Phi$ are the
zrf's of $K^a$. The other halves of both the bf's and zrf's of $\Phi$ are not
exactly the bf's of $K^a$ and the zrf's of $K$, except when systems are
stationary.

The function $\Phi(t_2, t_1)$ is symmetrical in $t_2$ and $t_1$. As a result,
$\Phi$ is its own adjoint. The self-adjoint property of $\Phi(t_2, t_1)$
corresponds to the evenness of the frequency function $Y(p)\bar{Y}(p)$, which
was noted in the analysis of stationary systems.

When (uncorrelated) signal and noise ensembles each have auto-covariances
of the general form (96b), the auto-covariance of the signal-plus-noise is
$\Phi_S + \Phi_N$, and it has the same general form. The $U_\sigma$'s and
$W_\sigma$'s of $\Phi_F$ simply comprise all the $U_\sigma$'s and $W_\sigma$'s of
$\Phi_S$ and $\Phi_N$.

Fig. 14. A (nonphysical) network whose impulse response is $\Phi(t_2, t_1)$.
The two tandem stages are
$G_m(\tau) = \int_{-\infty}^{+\infty} K^a(\tau, t_1)G_1(t_1)\,dt_1$ and
$G_2(t_2) = \int_{-\infty}^{+\infty} K(t_2, \tau)G_m(\tau)\,d\tau$.

3.6.4 An Analog of the Bode-Shannon Method

Now consider the calculation of linear least-squares smoothing and
prediction operators when auto-covariances, $\Phi_S$ and $\Phi_N$, of signal and
noise are known, have the form (96b) and satisfy condition (94a). The
method described below is analogous to the method of Bode and Shannon,
as interpreted in Section 3.2.4.

We begin with the physical model illustrated in Fig. 15, which is
exactly like Fig. 8, of Section 3.2.4, except that the (now nonstationary)
impulse responses of the networks have the form $K(t, \tau)$, instead of
$K(t - \tau)$ [the inverse transform of $Y(p)$]. The method described in
Section 3.2.4 can be adapted to the model of Fig. 15, in principle at least,
provided an impulse response $K_F(t, \tau)$ of the form (96b) does, in fact,
exist which has the following properties: it must turn white noise into
an $f(t) = s(t) + n(t)$ with the required auto-covariance, $\Phi_F(t_2, t_1)$; it
must be "physical"; there must exist a "physical" impulse response,
$K_F^{-1}(t, \tau)$, which turns $f(t)$ back into the white noise.

The manipulations described in Section 3.6.2 may be used to decompose
the differential equation corresponding to $\Phi_F$ into a pair of
differential equations which are at least superficially like (97a) and (97b).
Under condition (94a), a particular decomposition will always have the
following properties: the orders of the two equations are the same; the
two equations have identical coefficients $H$ of (93); the bf's and zrf's of
(97a) are all "nonphysical", while those of (97b) are all "physical" [as
determined by the real parts of the $p_\sigma$'s of (94b)]; the corresponding
impulse responses will be respectively $K_F^a$ and $K_F$ provided they are, in
fact, adjoints, each of the other.

While a rigorous proof has not been completed, there is strong evidence
that the two impulse responses will, in fact, be adjoints when derived in
the manner described from auto-covariances of the form (96b) subject to
the condition (94a). The same probably holds true in many situations
where (94a) is not satisfied. When (97a) and (97b) have been determined,
$K_F^{-1}$ may also be found, by merely interchanging $P$ and $Q$ in (97b) and
then calculating the corresponding impulse response. It will be physical
because the zrf's of $K_F$ have been made physical.

Fig. 15. A physical model of a nonstationary system.

For the most part, the calculations described above are straightforward
but laborious. The bf's of the differential equation for $\Phi_F$ are the
basis functions $U_\sigma$ and $W_\sigma$ of $\Phi_S$ and $\Phi_N$, and are known
when $\Phi_S$ and $\Phi_N$ are known in the form (96b). The zrf's of $\Phi_F$,
however, must be calculated as solutions of a homogeneous linear
differential equation with time-variable coefficients. This is analogous to
finding the zeros of the spectrum $N + S$ when given rational spectra $N$ and
$S$. The computational difficulties, however, are very much greater, and
they are likely to limit applications to quite simple problems.

3.6.5 An Analog of the Zero and Pole Method

Some of the laborious calculations (but not the calculation of zrf's of
$\Phi_F$) may be avoided by techniques analogous to the zero and pole
techniques described in Section 3.2, etc. For definiteness, let the optimum
impulse response, $K_M(t, \tau)$, be chosen from the class of all stationary
and nonstationary physical impulse responses (infinite memories included)
and let the prediction interval $\alpha$ be greater than 0. Methods similar to
those used in Sections 2.6 and 3.2.3 yield a Wiener-Hopf equation of the
following sort [a generalization of (45)]:

$$\int_{-\infty}^{t} K_M(t, t_1)\,\Phi_F(t_2, t_1)\,dt_1 = \Phi_S(t + \alpha, t_2), \qquad t_2 < t. \tag{98}$$

When $\Phi_F = \Phi_S + \Phi_N$, and $\Phi_S$, $\Phi_N$ have the form (96b), (98)
becomes:

$$\sum_{\sigma} U_\sigma(t_2)\int_{-\infty}^{t_2} K_M(t, t_1)W_\sigma(t_1)\,dt_1
+ \sum_{\sigma} W_\sigma(t_2)\int_{t_2}^{t} K_M(t, t_1)U_\sigma(t_1)\,dt_1
= {\sum_{\sigma}}' W_\sigma(t_2)U_\sigma(t + \alpha), \qquad t_2 < t. \tag{99}$$

The sum $\sum'$ includes those terms in $\sum$ which are contributed by
$\Phi_S$, but not those from $\Phi_N$.

A differential equation may be derived from (99) which is exactly the
differential equation corresponding to $\Phi_F$, with $K_M$ taking the role of
the "input" function, $G_1(t_1)$, and the right-hand side of (99) taking the
role of the "output" function $G_2(t_2)$. Time $t$ is a constant in the
derivation and analysis of this differential equation. Then $G_2(t_2)$ is a
sum of some of the bf's of the differential equation. As a result, $K_M$ must
be a linear combination of zrf's of $\Phi_F$, except that $\delta$ functions may
be permitted at the limit of the integration, $t_1 = t$. More exactly, $K_M$ is
a linear combination of the physical zrf's of $\Phi_F$, plus possible $\delta$
functions. This is analogous to condition (36a) of Section 3.2.1, which
makes the poles of $Y_M$ the lhp zeros of $N + S$.

The coefficients of the various terms in the linear combination may be
determined by using the linear combination with general coefficients, in
place of $K_M$ in (99). Evaluating the integrals yields a linear combination
of the $U_\sigma$'s, which are the physical bf's of $\Phi_F$. Since the
combination must be identically zero [when the right-hand side of (99) is
included], the net coefficient of each $U_\sigma$ must be zero. The result is a
set of linear equations which determine the coefficients of the various
terms in $K_M$. The equations are analogous to the conditions (36b) on the
behavior of $Y_M$ at lhp poles of $N$ and $S$.

IV. FURTHER SPECIFIC PROBLEMS AND APPLICATIONS 

The central problems described in Section III may be adjusted to fit
various engineering problems. The adjustments, however, may require
changes in various details. The examples described in this section
illustrate both engineering usefulness and ways in which details may be
changed. Some of the examples represent existing engineering applications.
Some are merely potentially useful. Others are of interest primarily
for theoretical reasons. The specific changes in the central problems are
reviewed in more general terms in Section 4.6.

4.1 Problems Related to Anti-Aircraft Fire Control

The correct aiming of anti-aircraft artillery depends, fundamentally,
on data smoothing and prediction. It also illustrates several of the ways
in which the central problems may be modified to meet practical
requirements. The anti-aircraft problem, as such, will not be developed in
more detail than is needed for purposes of illustration.

An anti-aircraft projectile must be aimed at a predicted future position
of the target, for which the prediction is based on positions of the target
observed at present and past times. The observations (optical or radar)
are contaminated by fluctuating observational errors, corresponding to
some statistical ensemble. The true motion of the target also corresponds
to a statistical ensemble, for the direction of flight and the speed will
change with time, in ways which are not entirely predictable. Then the
observational errors correspond to our noise, n(t), and the true position
of the target corresponds to our true signal, s(t). While the problem is
really three-dimensional, it will be sufficient for our purposes to consider
a single one-dimensional component.

It is generally reasonable to use a Gaussian model for the observational
errors. The observations, however, are likely to be available only for a
limited interval. Thus the finite-memory form of the general problem
should be assumed.

The true-signal (target) statistics are not well represented by a
Gaussian model. Furthermore, the average square error is not a reasonable
criterion of accuracy, without a careful interpretation. (The average
square error gives most weight to large errors, while the "kill probability"
depends on the frequency of small errors.) A somewhat nonoptimal
solution is generally accepted, as described below.

Most anti-aircraft fire control systems are designed around the
following assumption:

    In the absence of observational errors, the future position of a
    nonaccelerating target is to be predicted perfectly.             (100)

A nonaccelerating target flies a straight-line, constant speed course.
Under (100), the actual errors in prediction will depend on the actual
observational errors and on the actual accelerations of the target during
the combined observation and prediction intervals ($t - T$ to $t + \alpha$).

The future position of a nonaccelerating target may be determined
from its present position and present velocity. Hence, physical linear
operators can satisfy (100). Generally, the truly optimum linear operator
cannot be very different. Anti-aircraft systems must have a high
percentage accuracy. That is, position errors must be small compared with
typical distances from tracker to target or from present target position
to predicted future position. Furthermore, the signal ensemble may
generally be regarded as centered about the nonaccelerating target
courses. Therefore, the truly optimum linear operator must sum the
observations in a way which comes very close to giving perfect prediction
under the special condition stated in (100). Then small changes will
make these particular predictions perfect.

Let $x$ be any one component of position. Then

$$x(t + \alpha) = x(t) + \alpha\bar{\dot{x}}. \tag{101}$$

Here, $\bar{\dot{x}}$ is the average rate of change of $x$, averaged over the
prediction interval, $t$ to $t + \alpha$. Under (100), separately optimum
estimates of $x(t)$ and $\bar{\dot{x}}$, used in (101), give the optimum estimate
of $x(t + \alpha)$. (A proof is not needed for our present purposes.) In
anti-aircraft systems, errors in $x(t)$ are usually less significant than
errors in $\alpha\bar{\dot{x}}$, and they are generally less subject to reduction
by data smoothing. Then attention is centered on the optimum estimation of
$\bar{\dot{x}}$.
The optimum estimation of $\bar{\dot{x}}$ is described in Sections 4.1.1
through 4.1.4. In these sections, our signal, s(t), becomes the true
velocity $\dot{x}(t)$. Instead of predicting $\dot{x}(t + \alpha)$, we are to
predict $\bar{\dot{x}}$, which is the average of $\dot{x}(t)$ over the interval
$t$ to $t + \alpha$ ($\bar{\dot{x}}$ is a functional of $\dot{x}$). The prediction
is to be obtained by linear operations applied to observed positions $x(t)$.
Since a factor $p$ in a frequency function corresponds to differentiation,
we may write

$$g(t) = \bar{\dot{x}} + e(t) = x(t) \text{ modified by a linear operator } Y_G(p)\,p. \tag{102}$$

Here, $Y_G(p)$ represents the data smoothing applied to the apparent rate
of change, $f(t)$ = observed $\dot{x}(t)$. The error part of $f(t)$ is described
by the spectrum

$$N = -p^2N_x = \omega^2N_x, \tag{103}$$

where $N_x$ is the error spectrum for $x(t)$ itself.

4.1.1 Optimum Measurement of a Constant Velocity*

In this section we assume the following conditions:

a. Positions $x$ are observed from $t - T$ to $t$,

b. Condition (100) is to be satisfied,                              (104)

c. The true $\dot{x}(t)$ is constant from $t - T$ to $t + \alpha$.

When condition (100) is satisfied and the actual $\dot{x}(t)$ is constant, the
entire error $e(t)$ must be due to errors in observation. Then, by (103) and
the noise part of (19) or of Fig. 2,

$$\sigma^2 = \int_{-\infty}^{+\infty} |Y_G|^2\,N_x\,\omega^2\,d\omega. \tag{105}$$

When $\dot{x}(t)$ is constant, present $\dot{x}(t) = \bar{\dot{x}}$, and prediction
time $\alpha$ need not appear at all. Then (104b) requires that $Y_G(p)$ applied
to a constant must yield the same constant. Since the response to a
constant signal $C$ is $CY_G(0)$, we now have

$C_Y$ is the subclass of the class (47) such that $Y_G(0) = 1$.        (106)

If $Y_{Gj}(0) = Y_{Gk}(0) = 1$, the difference $\Delta_Y(0) = 0$. Then

$C_\Delta$ is the subclass of the class (48) such that $\Delta_Y(0) = 0$.   (107)

Note that (107) is consistent with (29).

* An early treatment of this problem, yielding solutions of special cases,
is included in a wartime report by Phillips and Weiss.




The optimum operator $Y_M$ is now the $Y_G$ of the class (106) which
minimizes $\sigma^2$ in (105). As in Section 3.3, use

$$Y_M = A + Be^{-Tp}, \qquad Y_M,\ \bar{Y}_Me^{-Tp} \text{ are rrhp.} \tag{108}$$

Equation (51) may now be adapted to the present problem as follows:
Omit terms proportional to $S$. Replace $N$ by $N_x\omega^2$, or, in terms of
$p^2 = -\omega^2$, by $-N_xp^2$. To take account of (107), rearrange the
integrands, so that $\Delta_Y/p$ becomes the variation factor rather than
$\Delta_Y$ [(107) excludes poles of $\Delta_Y/p$ at $p = 0$]. This changes (51)
into

$$\int_{-i\infty}^{+i\infty} (AN_xp^3)\,\frac{\bar{\Delta}_Y}{p}\,dp
+ \int_{-i\infty}^{+i\infty} (BN_xp^3)\,e^{-Tp}\,\frac{\bar{\Delta}_Y}{p}\,dp = 0. \tag{109}$$

Section 3.3 may now be paralleled further, to assemble conditions
corresponding to (52). Since there are no terms in $e^{\alpha p}$ in (109), a
rational $N_x$ leads to rational $A$ and $B$. Then (52a) and (52b) may be
replaced by (49). The resulting list of conditions is as follows:

a. $A + Be^{-Tp} = 1$ at $p = 0$,

b. $A + Be^{-Tp}$ is rfp,

c. $AN_xp^3$ is rlhp,                                                (110)

d. $BN_xp^3$ is rrhp,

e. When $\omega$ is real and $\to \infty$,
   $|A|^2N_x\omega^2$ and $|B|^2N_x\omega^2 = O\omega^{-2}$.

When $N_x$ is rational, these conditions are just sufficient to determine a
rational $A$ and $B$.

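As an example, suppose $N_x$ is*

$$N_x = \frac{1}{\pi}\,\frac{\sigma_x^2\,\omega_0}{\omega_0^2 + \omega^2}
      = \frac{1}{\pi}\,\frac{\sigma_x^2\,\omega_0}{\omega_0^2 - p^2}. \tag{111}$$

Then (110c), (110d) and (110e) restrict $A$ and $B$ to

$$A = \frac{a_1 + a_2p + a_3p^2}{p^3}, \qquad B = \frac{b_1 + b_2p + b_3p^2}{p^3}. \tag{112}$$

* The scale factor has been designated in such a way that $\sigma_x^2$ is in
fact the average squared position error, which is related to $N_x$ by
$\sigma_x^2 = \int_{-\infty}^{+\infty} N_x\,d\omega$.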
Given (112), one can use (110a) through (110d) to find $a_1$, $a_2$, $a_3$,
$b_1$, $b_2$, $b_3$. The resulting $Y_M$ may be arranged as follows:

$$Y_M = -\frac{12J}{T^3p^3}\left[\left(1 + \frac{p}{\omega_0}\right)(1 - cp)
      - \left(1 - \frac{p}{\omega_0}\right)(1 + cp)\,e^{-Tp}\right], \tag{113}$$

in which the constants $J$, $c$ are related to $T$, $\omega_0$, by

$$c = \frac{T}{2}\left(1 + \frac{2}{T\omega_0}\right), \qquad
  J = \frac{1}{1 + \dfrac{6}{T\omega_0} + \dfrac{12}{T^2\omega_0^2}}. \tag{114}$$

The corresponding impulse response $K_M(t)$ may be arranged as follows:

$K_M(t) = JK_1(t) + (1 - J)K_2(t)$,

$K_1(t) = \frac{6}{T^3}\,t(T - t)$ when $0 < t < T$,
                                                                    (115)
$K_2(t) = \frac{1}{T}$ when $0 < t < T$,

$K_1(t) = 0$, $K_2(t) = 0$ when $t < 0$ or $> T$.

The two functions $K_1$ and $K_2$ are the unit-area parabola and unit-area
step shown in Figs. 16(a) and 16(b). Then $K_M$ is the combination shown
in Fig. 16(c).
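The combination of parabola and step can be evaluated directly. The sketch below is illustrative only: it assumes the reading J = 1/(1 + 6/(Tω0) + 12/(Tω0)²) of (114) and the unit-area shapes 6t(T - t)/T³ and 1/T of (115); the function names and the numerical values T = 10, ω0 = 1 are hypothetical choices, not values from the paper.

```python
def blend_factor(T, omega0):
    """J as read from (114): 1 / (1 + 6/(T w0) + 12/(T w0)^2)."""
    x = T * omega0
    return 1.0 / (1.0 + 6.0 / x + 12.0 / (x * x))


def parabola(t, T):
    """Unit-area parabola on 0 < t < T, zero elsewhere."""
    return 6.0 * t * (T - t) / T**3 if 0.0 < t < T else 0.0


def step(t, T):
    """Unit-area step on 0 < t < T, zero elsewhere."""
    return 1.0 / T if 0.0 < t < T else 0.0


def optimum_weight(t, T, omega0):
    """Weighted combination J*parabola + (1 - J)*step."""
    J = blend_factor(T, omega0)
    return J * parabola(t, T) + (1.0 - J) * step(t, T)


def area(weight, T, steps=20000):
    """Midpoint-rule integral of a weighting function over 0..T."""
    h = T / steps
    return sum(weight((k + 0.5) * h) for k in range(steps)) * h


J_example = blend_factor(10.0, 1.0)
area_example = area(lambda t: optimum_weight(t, 10.0, 1.0), 10.0)
```

Because both component shapes have unit area, any value of J between 0 and 1 leaves the total area equal to one, which is the time-domain form of the requirement that the operator reproduce a constant velocity exactly.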

The minimum $\sigma^2$ determined by (105) may be found by evaluating

$$\sigma^2 = \int_{-\infty}^{+\infty} (A + Be^{-Tp})(\bar{A} + \bar{B}e^{Tp})\,N_x\,\omega^2\,d\omega. \tag{116}$$

Fig. 16. Parabolic smoothing: (a) parabola; (b) step; (c) parabola + step.
The evaluation is complicated by the presence of the two exponentials
in the integrand, and the multiple poles of $A$ and $B$ at $p = 0$, as
determined by (112). The complications may be resolved by splitting the
integral into the following two integrals:

$$\sigma^2 = -\frac{1}{i}\int_{-i\infty}^{+i\infty}
  \left(A + Be^{-Tp} - \frac{1}{1 + cp}\right)\bar{A}N_xp^2\,dp
  - \frac{1}{i}\int_{-i\infty}^{+i\infty}
  \left[\frac{\bar{A}N_xp^2}{1 + cp} + (Ae^{Tp} + B)\bar{B}N_xp^2\right]dp. \tag{117}$$

Each integrand $= O\omega^{-2}$ when $\omega$ is real and $\to \infty$; it is regular
at $p = 0$ [by (110a) and (110b)]; and it involves only a single exponential.
The first integrand is rrhp; the second is rlhp except for a single pole of
$\bar{A}N_xp^2/(1 + cp)$ at $p = -\omega_0$. Closing the contours of integration
by suitable arcs at $\infty$, and applying the residue theorem gives

$$\sigma^2 = \frac{24J\sigma_x^2}{T^3\omega_0}. \tag{118}$$

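4.1.2 A White Noise Approximation

The constant $J$ in (113) and (114) is a function of $T\omega_0$. Generally,
$T\omega_0$ is quite large. In the limit, as $T\omega_0 \to \infty$, $J \to 1$ and
(113), (115) and (118) become

$Y_M = -\frac{12}{T^3p^3}\left[\left(1 - \frac{Tp}{2}\right) - \left(1 + \frac{Tp}{2}\right)e^{-Tp}\right]$,

$K_M(t) = \frac{6}{T^3}\,t(T - t)$ when $0 < t < T$,                   (119)

$\sigma^2 = \frac{24\sigma_x^2}{T^3\omega_0} = \frac{12}{T^3}\,2\pi N_x(0)$.

The parabolic smoothing represented by (119) was derived by Bode
in a quite different way.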
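A sampled-data analog makes the inverse-cube dependence on the smoothing time easy to verify. The sketch below is an illustration, not the paper's derivation: it assumes n equally spaced position samples with independent errors of variance `noise_var`, and fits a straight line by ordinary least squares. The weights applied to the positions are then linear in time (the discrete counterpart of differentiating a parabolic weighting of velocity), and the slope variance approaches 12·(noise_var·dt)/T³, with noise_var·dt playing the role of the white noise level. All names and numerical values are hypothetical.

```python
def slope_weights(n, dt):
    """Least-squares weights w_k such that slope = sum(w_k * x_k)."""
    times = [k * dt for k in range(n)]
    mean_t = sum(times) / n
    spread = sum((t - mean_t) ** 2 for t in times)
    return [(t - mean_t) / spread for t in times]


def slope_variance(n, dt, noise_var):
    """Variance of the fitted slope for independent errors of variance noise_var."""
    return noise_var * sum(w * w for w in slope_weights(n, dt))


n, dt, noise_var = 200, 0.05, 4.0
T = n * dt                                   # total smoothing time
exact = slope_variance(n, dt, noise_var)     # discrete result
law = 12.0 * noise_var * dt / T**3           # continuous-time T**-3 law

# The same weights recover the slope of a noiseless straight line.
line = [2.0 + 3.0 * k * dt for k in range(n)]
recovered = sum(w * x for w, x in zip(slope_weights(n, dt), line))
```

The discrete and continuous values differ only by the factor n²/(n² - 1), so doubling the smoothing time reduces the velocity variance by very nearly a factor of eight.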

Note that $N_x(0)$ is the spectral density of the position errors at low
frequencies. The result, (119), may be obtained more simply by
substituting the constant $N_x(0)$ for $N_x(p)$ in (110).

The nature of the approximation is explained further by Fig. 17.
Curve (a) represents a more general $N_x$, including a "spike" at
$\omega = 0$ to allow for drift errors, etc. Curve (b) indicates the
(qualitative) nature of the corresponding function $|Y_M|^2\omega^2$. The
variance, $\sigma^2$, is the integral of the product of these two functions.
Therefore, the value of $N_x$ is unimportant in regions where
$|Y_M|^2\omega^2$ is very small. Thus, the white noise spectrum (c) gives about
the same result as the actual spectrum $N_x$.

Fig. 17. Explanation of a white noise approximation.

4.1.3 Constant Target Accelerations

According to (118) and (119), increasing $T$ always decreases $\sigma^2$. The
$\sigma^2$ of (118) and (119), however, is not the total variance $\sigma_T^2$ of
the predicted $\bar{\dot{x}}$, except when the target acceleration may be
neglected. The effect of a given target acceleration increases as $T$
increases, and this limits the values of $T$ for which (118) and (119) are
good approximations to $\sigma_T^2$.

A higher order of approximation takes account of target accelerations,
but assumes that they are constant over the interval $t - T$ to $t + \alpha$.
If the $Y_M$ of (119) is retained unchanged, $\sigma_T^2$ becomes






(120) 



Here, $\sigma_a^2$ is the (ensemble) average squared acceleration of the
target.

A smaller $\sigma_T^2$ may be obtained by modifying the conditions which
determine $Y_M$ in a way which takes account of target accelerations. When
the expected accelerations are large, it may be reasonable to strengthen
(100) to the following:

    In the absence of observational errors, the future position of a
    target with constant acceleration is to be predicted perfectly.  (121)

The stronger condition reduces the function classes $C_Y$ and $C_\Delta$ to the
subclasses of (47) and (48), such that

a. $Y_G(p) \to 1 + \tfrac{1}{2}\alpha p$ as $p \to 0$,
                                                                    (122)
b. $\Delta_Y(p) < cp^2$ as $p \to 0$.

These changes in $C_Y$ and $C_\Delta$ lead to the following changes in (110):
condition (110a) is replaced by (122a); the factor $p^3$ in (110c) and (110d)
is replaced by $p^4$. The white noise assumption leads to the following, in
place of (119):

$K_M(t) = \frac{6}{T^3}\,t(T - t)\left[1 + 5\left(1 + \frac{\alpha}{T}\right)\frac{T - 2t}{T}\right]$,
                                                                    (123)
$\sigma^2 = \frac{12}{T^3}\left[1 + 15\left(1 + \frac{\alpha}{T}\right)^2\right]2\pi N_x(0)$.

If the expected acceleration does not have a large effect in (120) with
the largest $T$ for which observations are available, full compensation may
not be justified. (The compensation increases the sensitivity to
observational errors unless $T$ can be increased.) A possible, though more
complicated, procedure is then as follows: Assume that the target
acceleration is nonzero but time-invariant, and that the ensemble average
of the squared acceleration is $\sigma_a^2$. Then minimize the combined effects
of the target acceleration and the observational errors.

The combined effects now give



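$$\sigma_T^2 = \int_{-\infty}^{+\infty} |Y_M|^2\,N_x\,\omega^2\,d\omega
  + \sigma_a^2\left[\frac{Y_M - 1}{p} - \frac{\alpha}{2}\right]^2_{p=0}. \tag{124}$$

To keep the second term finite, (106) and (107) are again in order. For
a minimum $\sigma_T^2$,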



£",,... f:M..+.[f(^-i)L.- 0,0.) 

in which the integrals are the same as in (109). The conditions (110) are
now modified to permit simple poles of the two integrands at $p = 0$,
which cancel out in their sum. The two contours of integration may be
indented at $p = 0$, provided they are indented in the same way. Conditions
(110c) and (110d) permit one contour to be closed at infinity around
the lhp, and the other around the rhp. Then the residue theorem makes
one integral proportional to $\Delta_Y/p$ at $p = 0$, and parameters can be
adjusted to cancel the last term in (125).* The result is

* The contour may be indented to pass to either side of $p = 0$, provided the
indentation is the same for both integrals. The final results are the same
because the poles of the integrands at $p = 0$ cancel out in their sum.
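$K_M(t) = \frac{6}{T^3}\,t(T - t)\left[1 + \frac{5}{1 + Q}\left(1 + \frac{\alpha}{T}\right)\frac{T - 2t}{T}\right]$,

$\sigma_T^2 = \frac{12}{T^3}\left[1 + \frac{15}{1 + Q}\left(1 + \frac{\alpha}{T}\right)^2\right]2\pi N_x(0)$,      (126)

$Q = \frac{720}{\sigma_a^2T^5}\,2\pi N_x(0)$.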

4.1.4 More General Target Accelerations

Comparing (123) with (120) shows that compensation for constant
target acceleration increases the sensitivity to observational errors,
unless $T$ is increased. (This is because the observation of acceleration, as
well as velocity, is included implicitly in the new $Y_M$.) Sensitivity to
fluctuations in the target acceleration is also increased, and becomes
greater as $T$ increases. In principle, $Y_M$ can be modified further, to give
perfect prediction in the absence of observational errors, whenever the
acceleration is matched perfectly by, say, an nth degree polynomial with
random coefficients.* The sensitivity to observational errors will be
further increased, unless $T$ is increased. However, any reasonable
acceleration ensemble which involves only a finite number of random
parameters will lead to a $Y_M$ such that $\sigma^2 \to 0$ as $T \to \infty$.

The infinite memory problem may be handled more reasonably by
assigning a spectrum, $N_a$, to the target accelerations. Then (105) may be
replaced by
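$$\sigma_T^2 = \int_{-\infty}^{+\infty}\left[\,|Y_G|^2\,N_x\,\omega^2
  + \left|Y_G - \frac{e^{\alpha p} - 1}{\alpha p}\right|^2\frac{N_a}{\omega^2}\right]d\omega. \tag{127}$$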

If the optimum impulse response $K_M(t)$ corresponding to $Y_M(p)$ is
negligible when $t > t_m$, there may be no advantage in extending the
interval of observation to times older than $t - t_m$. The practical
significance of a quantitative limit may be weakened, however, by the
non-Gaussian character of actual target statistics.

4.2 Measurements with Multiple Instruments†

This section describes further the two-instrument problem noted in
Section 3.4. It will be assumed that $s(t)$ is to be estimated for present
time, $t$, and need not be predicted for a future time, $t + \alpha$. When
$\alpha = 0$, (73) becomes

$$\sigma^2 = \int_{-\infty}^{+\infty}\left(|Y_{G1}|^2N_1 + |Y_{G2}|^2N_2
  + |Y_{G1} + Y_{G2} - 1|^2S\right)d\omega. \tag{128}$$

In (72), $f_1(t)$ and $f_2(t)$ are now the results of observing a physical
variable, $s(t)$, with two different instruments. Then $N_1$ and $N_2$ are the
spectra of the instrumental errors $n_1(t)$ and $n_2(t)$.

* Blackman has related the corresponding $K_M(t)$ to Legendre polynomials.

† The use of multiple instruments is described in more detail in reports
relating to specific applications. Principles are described in papers by
Bendat and Stewart and Parks.

4.2.1 Elimination of Errors Proportional to True Signal

When the two instruments have reasonably high percentage accuracies,
$S \gg N_1$ and $N_2$. Then the $Y_{G1}$ and $Y_{G2}$ which minimize $\sigma^2$ in
(128) will make $|Y_{G1} + Y_{G2} - 1|$ very small. Under these conditions, it
is reasonable to approximate the true optimum by making the factor exactly
zero. Then $Y_{G2}$ is related to $Y_{G1}$ by

$$Y_{G2} = 1 - Y_{G1}. \tag{129}$$

Under (129), the $S$ term in (128) drops out. Using $Y_G$ to represent
$Y_{G1}$ and $1 - Y_G$ to represent $Y_{G2}$ gives

$$\sigma^2 = \int_{-\infty}^{+\infty}\left(|Y_G|^2N_1 + |1 - Y_G|^2N_2\right)d\omega. \tag{130}$$

Now, $\sigma^2$ is to be a minimum with respect to the single frequency
function $Y_G$.

Formally, (130) is exactly the same as (19) (with $\alpha = 0$). Rational
spectra $N$ and $S$ have merely been replaced by rational spectra $N_1$ and
$N_2$. Thus the whole of Sections 3.2 and 3.3 may be applied.

4.2.2 Determination of Position from Position and Velocity Measurements

As an example, suppose an aircraft's position is measured with one
instrument and its velocity with another, and that the two measurements
are to be combined to determine the present position to high accuracy.

Let $N_x$ and $N_v$ be the error spectra of the position and velocity
observations. Then, if all errors are referred to positions, $N_1 = N_x$,
$N_2 = N_v/\omega^2$, and (130) becomes

$$\sigma^2 = \int_{-\infty}^{+\infty}\left(|Y_G|^2N_x + |1 - Y_G|^2\,\frac{N_v}{\omega^2}\right)d\omega. \tag{131}$$

In (131), $\sigma^2$ will be bounded only if $Y_G(0) = 1$ [assuming that
$N_v(0) \neq 0$]. When $Y_G(0) = 1$, applying $Y_G$ to a constant position leaves
the position unchanged. This situation may be interpreted as follows:
By (13), any $Y_G$ applied to the true position yields a weighted integral
of the true positions existing during the "smoothing interval". When
$Y_G(0) = 1$, the weighted integral is a weighted average, and the position
measurements are used to determine a weighted average position, averaged
over the smoothing interval. When the position is not constant, the
present position will generally be different from the average, and the
difference may be calculated from the velocities observed during the
averaging or smoothing interval. The correction may be said to "update"
the average. The first term in the integrand in (131) corresponds to errors
in the weighted average position, determined from the position
measurements, and the second term corresponds to errors in the updating,
determined from the velocity measurements.

If the position measurements are used alone and if guessed velocities
are, in fact, quite uncertain, an adequate smoothing interval may lead
to large updating errors. On the other hand, if the velocity measurements
are used alone, they must be integrated over the entire time of flight, and
velocity errors may accumulate into large position errors. Thus, the two
instruments together may give a very much higher accuracy than is
possible with either instrument alone. This may be explained further by
citing differences between the spectra $N_x$ and $N_v/\omega^2$, and comparing
$Y_G$ and $1 - Y_G$, in (131), to the transfer functions of a pair of
separating filters.
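The separating-filter interpretation can be illustrated with a simple discrete recursion. The sketch below is an assumed first-order example, not the optimum operator of (131): the position fixes pass through a low-pass operator and the integrated velocity passes through its complement, so the two operators sum to one. The function name, the blend value and the test signals are hypothetical.

```python
def complementary_filter(positions, velocities, dt, blend):
    """Blend position fixes with integrated velocity.

    estimate[k] = blend * (estimate[k-1] + velocities[k] * dt)
                  + (1 - blend) * positions[k]
    The position path sees a low-pass operator and the integrated
    velocity sees its complement, so the two operators sum to one.
    """
    estimate = [positions[0]]
    for k in range(1, len(positions)):
        predicted = estimate[-1] + velocities[k] * dt   # "update" by velocity
        estimate.append(blend * predicted + (1.0 - blend) * positions[k])
    return estimate


# Assumed test track: an accelerating target sampled every dt seconds.
dt, count, blend = 0.1, 200, 0.9
truth = [5.0 + 2.0 * (k * dt) + 0.5 * (k * dt) ** 2 for k in range(count)]
velocity = [0.0] + [(truth[k] - truth[k - 1]) / dt for k in range(1, count)]

# Error-free measurements are passed through without distortion.
clean = complementary_filter(truth, velocity, dt, blend)

# A rapidly alternating position error is strongly attenuated.
noisy_positions = [x + (-1.0) ** k for k, x in enumerate(truth)]
smoothed = complementary_filter(noisy_positions, velocity, dt, blend)
residual = max(abs(s - x) for s, x in zip(smoothed[150:], truth[150:]))
```

Because the two paths are complementary, the true motion is reproduced exactly whatever the blend value, while the blend controls how the position and velocity errors are traded against one another.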

4.2.3 Precalculated Information

Sometimes, the second measurements may be either replaced by, or
augmented by, previous nonstatistical information concerning the physical
variable. The "biased statistics" may be taken account of as follows:
Let $s_0(t)$ be a precalculated "nominal" $s(t)$, which may be regarded as
the ensemble average of $s(t)$. Then let

$s(t) = s_r(t) + s_0(t)$,

$f(t) = f_r(t) + s_0(t)$,
                                                                    (132)
$g(t) = g_r(t) + s_0(t)$,

$S$ = spectrum of $s_r(t)$ ensemble.

The time series $s_r(t)$ with spectrum $S$ may be regarded as the error in the
prediction of $s(t)$ without measurements, by precalculation alone. The
time function $g_r(t)$ is to be obtained by applying operator $Y_G(p)$ to
$f_r(t) = f(t) - s_0(t)$. Then $g(t)$ may be found by adding $s_0(t)$ to
$g_r(t)$. The error integral (128) is unchanged, provided $S$ is redefined as
in (132).

If $Y_{G2} = 0$, in (128), the second measurements are replaced entirely
by the precalculation of $s_0(t)$, and the error in the second measurements
is replaced by the error in the precalculation. If $Y_{G2}$ is neither 0 nor
$1 - Y_{G1}$, the estimate of $s(t)$ is based on three sources of information:
the two kinds of measurements and the precalculation of $s_0(t)$. Whether the
full generality is justified depends on the relative magnitudes and spectra
of the three corresponding errors.

4.3 A Signal Detection Problem. 

This section describes a simple problem related to signal detection.
The time function $f(t) = s(t) + n(t)$ is again observed and $g(t)$ is again
produced by applying, to $f(t)$, a physical linear operator $Y_G(p)$. Now,
however, the true signal $s(t)$ has the following properties: it is a time
function which has finite duration and a known shape, but which starts at an
unknown time. It may be represented as follows:

$$
\begin{aligned}
s(t) &= r(t - t_1), \quad t_1 < t < t_1 + w, \\
     &= 0, \quad t < t_1 \ \text{or} \ t > t_1 + w, \\
t_1  &= \text{a random variable}.
\end{aligned}
\tag{133}
$$

In the absence of noise, the response $g(t)$ will have a maximum value at
some value of $(t - t_1)$. The contribution to $g(t)$ from the noise $n(t)$ will
have an rms value. We are to find a particular linear operator $Y_M$ which
minimizes the ratio of rms noise response to maximum response to true
signal.

Since only ratios are of interest, the scale of $Y_M$ may be chosen to give
a unit maximum true response. Then the problem is to minimize the rms
noise response within this constraint. Given any valid solution for $Y_M$
producing a maximum true response at, say, $t_1 + t_m$, there will be equally
valid solutions producing maximum responses at later times. The
operator $Y_M$ can always be multiplied by $e^{-\beta p}$, representing an ideal delay $\beta$.
Thus, if $t_1 + t_m$ is the time of maximum true response, $t_m$ may be treated
as an arbitrary parameter, provided the final results are examined, to
determine what values of $t_m$ are, in fact, valid.

When $t = t_1 + t_m$, (133) makes $s(t - \tau)$ become $r(t_m - \tau)$. Then $Y_M$
is the physical $Y_G$ which minimizes the following $\sigma^2$, subject to the
following constraint:

$$
\begin{aligned}
\sigma^2 &= \int_{-\infty}^{+\infty} |Y_G|^2 N \, d\omega, \\
\int_{-\infty}^{+\infty} r(t_m - \tau) K_G(\tau) \, d\tau &= 1.
\end{aligned}
\tag{134}
$$



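Let $Y_r(p)$ be the transform of $r(t_m - \tau)$, regarded as a function of $\tau$.
Then, by Parseval's equation (T-14), the constraint may be written

$$
\begin{aligned}
\int_{-\infty}^{+\infty} Y_r \bar{Y}_G \, d\omega &= 1, \\
Y_r(p) &= \text{transform of } r(t_m - \tau).
\end{aligned}
\tag{135}
$$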



The isoperimetric method of the calculus of variations may now be
applied in the following way: When $Y_G$ is replaced by $Y_M + \Delta y$, (134)
and (135) each yield an integral which must vanish. The two integrals
may be summed in arbitrary proportion to get

$$
\int (Y_M N - k Y_r) \, \overline{\Delta y} \, dp = 0, \tag{136}
$$

in which $k$ is an initially undetermined constant. The methods of Section
3.2 then give†

$$
Y_M Y_N - k \frac{Y_r}{\bar{Y}_N} \ \text{is rlhp}. \tag{137}
$$

These can be solved for the physical $Y_M$, in either frequency-domain or
time-domain terms. The resulting $Y_M$ is proportional to $k$, and $k$ may then
be determined by (135).
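
A discrete, finite-length analog of this constrained minimization may help fix ideas. The sketch below is an illustration only, not the frequency-domain method of the text: the impulse response is assumed to be a vector `K` of a few samples, the noise is described by an assumed covariance matrix `C` rather than a spectrum, and the pulse samples are hypothetical. The undetermined multiplier `k` enters exactly as described above: the solution is proportional to `k`, and `k` is then fixed by the constraint.

```python
import numpy as np

def constrained_min_noise_filter(r_rev, C):
    """Minimize K^T C K subject to r_rev^T K = 1 by the Lagrange multiplier method.

    Stationarity gives C K = k * r_rev, so K is proportional to k,
    and k is then fixed by the constraint.
    """
    x = np.linalg.solve(C, r_rev)      # C^{-1} r_rev
    k = 1.0 / (r_rev @ x)              # multiplier determined by the constraint
    K = k * x
    return K, k

# Hypothetical pulse samples and exponentially correlated noise (assumed values)
r = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0])
idx = np.arange(r.size)
C = 0.8 ** np.abs(idx[:, None] - idx[None, :])   # noise covariance matrix
r_rev = r[::-1]                                   # samples of r(t_m - tau)

K, k = constrained_min_noise_filter(r_rev, C)
sigma2 = K @ C @ K                                # mean-square noise response

# A different filter meeting the same constraint, for comparison
K_alt = r_rev / (r_rev @ r_rev)
sigma2_alt = K_alt @ C @ K_alt
```

In this finite analog the minimum mean-square noise response equals the multiplier `k`, and any other filter satisfying the constraint gives a larger value.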

After further manipulation, the corresponding noise ratio turns out
to be

$$
\begin{aligned}
\sigma_m^2 &= k = \frac{1}{\int [K_{rN}(\tau)]^2 \, d\tau}, \\
K_{rN}(\tau) &= \text{inverse transform of } \frac{Y_r(p)}{\bar{Y}_N(p)} \\
&= r(t_m - \tau) * \left\{ \text{inverse transform of } \frac{1}{\bar{Y}_N(p)} \right\}.
\end{aligned}
\tag{138}
$$

In the formula for $\sigma_m^2$, the integration runs from 0 to $\infty$. In general,
$K_{rN}(\tau) \neq 0$ when $\tau < 0$, but increasing $t_m$ shifts $K_{rN}(\tau)$ along the time
axis in the direction of $\tau > 0$. Then the corresponding $\sigma_m^2$ must decrease.
It approaches a minimum value asymptotically, as $t_m$ becomes so large
that the "tail" of $[K_{rN}(\tau)]^2$ at $\tau < 0$ becomes negligibly small.

† Recall the definition of $Y_N$ by $N = Y_N \bar{Y}_N$, in Section 2.3.

When $N$ is either a constant or the reciprocal of a polynomial, $1/\bar{Y}_N$
is at most a polynomial in $p$ and $K_{rN}(\tau) = 0$ when $t_m - \tau > w$, where $w$ is the
length of the signal $r(t - t_1)$ as defined in (133). Then $\sigma_m^2$ reaches its
asymptotic value as soon as $t_m \geq w$. When $N$ has zeros at finite values
of $p$, $\sigma_m^2$ approaches its asymptotic value only when $t_m$ exceeds $w$ by the
effective correlation time of the spectrum $1/N$.

When $N$ is a constant, corresponding to white noise, the optimum
impulse response is $K_M(\tau) = r(t_m - \tau)$, apart from a constant scale factor.
Then $K_M$ may be described as a mirror image of the given signal form,
$r(t - t_1)$, as illustrated in Fig. 18. This is an old principle described, for
example, by North. The more general solution, for $N \neq$ constant, has been
described by Zadeh and Ragazzini.
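
The mirror-image principle for white noise can be checked with a short discrete sketch. The pulse samples and the comparison filter below are hypothetical, the noise is assumed to be white with unit variance per sample, and the filters are finite sequences rather than the continuous operators of the text.

```python
import numpy as np

def peak_response(K, r):
    """Maximum of the noise-free response (convolution of the pulse with the impulse response)."""
    return np.max(np.convolve(r, K))

def noise_to_peak_ratio(K, r):
    """RMS response to unit-variance white noise, divided by the peak response to the pulse."""
    return np.sqrt(np.sum(K ** 2)) / peak_response(K, r)

r = np.array([0.0, 0.5, 1.0, 2.0, 1.0])   # hypothetical pulse shape
K_mirror = r[::-1]                          # mirror image of the pulse
K_boxcar = np.ones_like(r)                  # a non-optimum comparison filter

ratio_mirror = noise_to_peak_ratio(K_mirror, r)
ratio_boxcar = noise_to_peak_ratio(K_boxcar, r)
```

Because only the ratio matters, the scale of either filter is immaterial. For the mirror-image filter the ratio reduces to one over the square root of the pulse energy, and the boxcar filter gives a larger ratio.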

A variation of the present problem restricts the class $C_Y$, of permitted
frequency functions, $Y_G$, to the finite memory class considered in
Section 3.3. The problem may be solved by combining the methods of this
section and of Section 3.3.4.

When $N \to 0$ as $\omega \to \infty$, it is easy to find pulse shapes such that the
rules of Sections 2.2 and 2.3 regarding behavior at $\infty$ are violated unless
$k = 0$ in (136). The corresponding $\sigma_m^2$ does, in fact, equal 0. An explanation
is as follows:

Suppose the $(m - 1)$th derivative of $r(t - t_1)$ is discontinuous and
consider a $Y_G(p)$ which approaches $cp^m$ as $p \to \infty$. The corresponding
response to $r(t - t_1)$ will include a $\delta$ function and its maximum value will
be $\infty$. If, at the same time, the noise spectrum $N = O(\omega^{-2m-2})$ as $\omega \to \infty$,
the rms response to the noise will be bounded and the ratio of rms
response to noise to maximum response to true signal will be 0. If
$N \to c\omega^{-2m}$, the rms noise itself will be $\infty$, but it can be shown that the
ratio of noise to maximum signal response will still be 0. Compare these
conditions with the continuity conditions described in Section 3.8.4.
Actual pulse shapes and noise spectra are not likely to behave in this way,
but what appear to be good approximations may. Care must be used to
choose approximations which behave properly as $p \to \infty$.

Fig. 18 — Time functions which are mirror images.

4.4 A Principle Relating to Diversity Systems

This section adapts the method of Section 3.4 to the following
problem: A single signal $s(t)$, with known spectrum $S(\omega)$, is observed by $m$
different devices. (These may be, for example, the receivers of a
communication system using the diversity principle.) The signal actually observed
by device $k$ is $f_k(t) = s(t) + n_k(t)$. The $m$ different $n_k(t)$ are uncorrelated
and have known spectra $N_k(\omega)$. Finally, the various noise spectra all have
the same shape, but they differ in amplitude. Thus,

$$
N_k(\omega) = J_k N(\omega), \quad k = 1, \ldots, m, \tag{139}
$$

where the $J_k$ are known constants and $N(\omega)$ is a known function. The
optimum physical filters are to be determined for estimating $s(t + \alpha)$ by
summing linear operations on the various $f_k(t)$.
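Paralleling (78) gives

$$
\begin{aligned}
&\text{a. } Y_{Mk} \text{ is rrhp}, \\
&\text{b. } \sum_k Y_{Mk} F_{kj} - e^{\alpha p} S \text{ is rlhp}, \\
&\qquad F_{kk} = S + J_k N, \\
&\qquad F_{kj} = S, \quad j \neq k, \\
&\text{c. When } \omega \text{ is real and } \to \infty, \\
&\qquad |Y_{Mk}|^2 N \ \text{and} \ \Bigl| \sum_k Y_{Mk} - e^{\alpha p} \Bigr|^2 S = O(\omega^{-2}).
\end{aligned}
\tag{140}
$$

The indices $k$ and $j$ run through the integers 1 to $m$.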

A set of $m$ equations similar to the pair in (79) may now be solved for
the $Y_{Mk}$, in terms of (unknown) rlhp functions $U_k$. Under the special
conditions which we are now assuming, however, a solution is obtained
more simply by examining linear combinations of the conditions (140b),
which give

$$
\begin{aligned}
&\left( \frac{N}{\sum_j 1/J_j} + S \right) \sum_k Y_{Mk} - e^{\alpha p} S \ \text{is rlhp}, \\
&(J_k Y_{Mk} - J_j Y_{Mj}) N \ \text{is rlhp}.
\end{aligned}
\tag{141}
$$

The second of these conditions can be satisfied, within the convergence
condition (140c), only if

$$
J_k Y_{Mk} = J_j Y_{Mj}. \tag{142}
$$

Because of (142), the various $f_k(t)$ may first be summed with weights
proportional to $1/J_k$ and then a single frequency-dependent operator
may be applied to the sum. In other words, only a single "filtering"
device is needed.* This confirms a result which might reasonably be
expected without formal analysis. The optimum filter characteristic is
proportional to $\sum_k Y_{Mk}$, and may be found by applying the methods of
Section 3.2 to the first condition of (141).

The use of a single filter for all the channels may actually be dictated
by cost considerations. The above analysis confirms its use on a
performance basis.
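
The weighting by $1/J_k$ can be illustrated with a small numerical sketch. It considers only the preliminary summation, not the frequency-dependent operator that follows it, and the three values of $J_k$ are hypothetical. With weights normalized to sum to 1 (so that the signal passes unchanged), the noise in the sum has the spectrum $N$ multiplied by $\sum_k w_k^2 J_k$, since the noises are uncorrelated.

```python
import numpy as np

def diversity_weights(J):
    """Weights proportional to 1/J_k, normalized so that they sum to 1."""
    inv = 1.0 / np.asarray(J, dtype=float)
    return inv / inv.sum()

def combined_noise_level(weights, J):
    """Noise level of the weighted sum, in units of the common spectrum N."""
    w = np.asarray(weights, dtype=float)
    return float(np.sum(w ** 2 * np.asarray(J, dtype=float)))

J = np.array([1.0, 2.0, 4.0])              # hypothetical amplitude constants J_k
w_opt = diversity_weights(J)
level_opt = combined_noise_level(w_opt, J)

w_equal = np.full(J.size, 1.0 / J.size)    # equal weights, for comparison
level_equal = combined_noise_level(w_equal, J)
```

With these weights the noise level of the sum is $1/\sum_k (1/J_k)$, which is smaller than the level obtained with equal weights whenever the $J_k$ differ.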

4.5 Nonstatistical Network Synthesis Applications

Nonstatistical problems in network synthesis sometimes may be
formulated in terms of the mathematics of data smoothing, even though no
data smoothing is involved. In particular, reasonable solutions sometimes
may be found by minimizing integrals similar to those which represent
our $\sigma^2$. This has been pointed out by Chang, with illustrations in terms
of a frequency-domain theory of optimum infinite-memory networks.
The possibilities appear to be much greater when a frequency-domain
form of the finite-memory problem is available. Possible uses, however,
have not been explored in detail. The example described below will
illustrate how problems may be formulated.

It will be simplest to develop the example in two stages. We will
begin by seeking a physical network function $Y_G(p)$ with the following
properties: The "step response" is to have a "rise time" $T$ and is to be
exactly 1 thereafter, as in Fig. 19(a). At the same time, $|Y_G|^2$ is to be
small at real frequencies above a cutoff frequency, $\omega \geq \omega_c$, as in Fig.
19(b). More exactly, $|Y_G|^2$ is to be as small as possible, within the
rise-time restriction. In order to apply the mathematics of data smoothing,
we will use an average square criterion $\sigma^2$ to judge the effective smallness.

The impulse response $K_G(t)$ is the derivative of the step response. It
will have to be zero when $t > T$, as in Fig. 19(a). Also, $Y_G(0)$ is equal to
the step response at $t = \infty$, and it will have to be exactly 1:

$$
\begin{aligned}
K_G(t) &= 0 \quad \text{when } t > T, \\
Y_G(0) &= 1.
\end{aligned}
\tag{143}
$$

* When the $N_k$ are not related as in (139), the $Y_{Mk}$ generally share a common
set of poles, but differ in regard to the residues at the poles.

Fig. 19 — Requirements imposed on a network response.

If the average-square criterion $\sigma^2$, applied to $|Y_G|^2$, is unweighted,

$$
\sigma^2 = \int_{\omega_c}^{\infty} |Y_G|^2 \, d\omega. \tag{144}
$$

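The limited range of integration, however, makes the minimization
problem very difficult. A more tractable compromise is the following weighted
average-square:

$$
\sigma^2 = \int_{-\infty}^{\infty} |Y_G|^2 R \, d\omega, \tag{145}
$$

where

$$
\begin{aligned}
R &= \text{a rational function of } \omega, \\
  &= \text{an even function} \geq 0, \\
  &= \text{small when } \omega \text{ is small}.
\end{aligned}
$$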

The problem now is as follows: find the particular "physical" function
$Y_M(p)$ which makes $\sigma^2$ of (145) a minimum, within the constraints (143).
Mathematically, this is exactly the problem described in Section 4.1.1,
except that the weight factor $R$ replaces spectrum $N$.

Details of the weight factor $R$ are arbitrary. They influence the
distribution of the loss, due to the network, in the high-loss region $\omega > \omega_c$.
An efficient choice of $R$ may place zeros and poles on the axis of real $\omega$,
with zeros $< \omega_c$ and poles $> \omega_c$. Real zeros and poles must be in
identical pairs to preserve $R \geq 0$. One of each pair is interpreted as in each half
plane, and $|Y_M|^2$ will always have zeros at real $\omega$ poles of $R$. Further
analysis indicates the following:

When all zeros and poles of $R$ are at real $\omega$,

$$
\begin{aligned}
R &= W^2, \\
Y_M &= \frac{1}{W} \left( u - \bar{u} e^{-pT} \right).
\end{aligned}
\tag{146}
$$

Poles of $1/W$ are canceled in $Y_M$ by zeros of the other factor. The
function $u$ may be determined by adapting the method of Section 4.1.1.
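
The first form of the problem can be illustrated numerically in the time domain for one special case. The sketch below assumes the weight $R = \omega^2$, which is a choice made here for illustration and not one taken from the text. With this weight, Parseval's relation makes the criterion proportional to the integral of $[K'(t)]^2$, which is finite only if $K(t)$ is continuous, so $K(0) = K(T) = 0$. The constraint that the frequency function equal 1 at zero frequency becomes $\int_0^T K(t)\,dt = 1$. The grid size and the value of $T$ are arbitrary.

```python
import numpy as np

def min_weighted_response(T, n):
    """Discretize K(t) on [0, T] and minimize the integral of K'(t)^2
    subject to integral of K(t) dt = 1, with K(0) = K(T) = 0.

    Solved as an equality-constrained quadratic problem (KKT linear system).
    """
    h = T / n
    m = n - 1                                   # interior unknowns K_1 .. K_{n-1}
    # Second-difference matrix: K^T A K approximates the integral of K'^2
    A = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    c = np.full(m, h)                           # quadrature weights for integral of K
    # KKT system: [2A  c; c^T  0] [K; lam] = [0; 1]
    kkt = np.zeros((m + 1, m + 1))
    kkt[:m, :m] = 2.0 * A
    kkt[:m, m] = c
    kkt[m, :m] = c
    rhs = np.zeros(m + 1)
    rhs[m] = 1.0
    sol = np.linalg.solve(kkt, rhs)
    t = np.linspace(0.0, T, n + 1)
    K = np.concatenate(([0.0], sol[:m], [0.0]))
    return t, K

T = 2.0
n = 200
t, K = min_weighted_response(T, n)
K_parabola = 6.0 * t * (T - t) / T ** 3         # closed-form minimizer for this special case
max_dev = np.max(np.abs(K - K_parabola))
area = np.sum(K) * (T / n)                      # discrete check of the unit-area constraint
```

For this assumed weight the Euler equation gives a constant second derivative, so the optimum impulse response is the parabola $6t(T - t)/T^3$, and the numerical solution agrees with it to within the discretization error.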

The problem may now be modified in the following way: the step
response need only approximate 1 when $t > T$. The approximation is to
be judged on an average-square basis. There are now two average square
criteria, referring respectively to the step response and the frequency
suppression. The two criteria are to be added to obtain a measure of
over-all performance. Since the step response is the integral of the impulse
response,

$$
\sigma^2 = \int_{-\infty}^{\infty} |Y_G|^2 R \, d\omega
 + \int_T^{\infty} \left[ \int_0^{\tau} K_G(\eta) \, d\eta - 1 \right]^2 d\tau. \tag{147}
$$

The relative importance of the frequency suppression and the step
response may be adjusted by adjusting the scale of the weight factor $R$.

The problem may be solved by splitting $Y_G$ into the following two
parts:

$$
\begin{aligned}
Y_G(p) &= Y_1(p) + pY_2(p), \\
K_1(t) &= 0 \quad \text{when } t > T, \\
Y_1(0) &= 1, \\
K_2(t) &= 0 \quad \text{when } t < T.
\end{aligned}
\tag{148}
$$

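The impulse response of $Y_2$ is the integral of the impulse response of
$pY_2$ (and is therefore equal to the step response of $pY_2$). Then the
constraints on $K_1(t)$ and $K_2(t)$ are such that

$$
\int_0^{\tau} K_G(\eta) \, d\eta - 1 = K_2(\tau), \quad \tau > T.
$$

Applying Parseval's equation (T-15) now gives:

$$
\sigma^2 = \int_{-\infty}^{\infty} \left( |Y_1 + pY_2|^2 R + |Y_2|^2 \right) d\omega.
$$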

The problem now is as follows: Find the "physical" $Y_1$ and $Y_2$ which
make $\sigma^2$ a minimum, within the constraints (148). The problem may be
solved by combining the methods of Sections 3.3, 3.4 and 4.1.1.

The optimum network function $Y_M(p)$ is not realizable with a finite
network in either form of the problem, but it may be approximated
arbitrarily closely. It should also furnish a reference for judging the
performance of finite networks designed in other ways. The method has not
yet been tested by detailed calculations.

4.6 More General Modifications of the Central Problems

The specific problems described in Sections 4.1 to 4.5 illustrate various
more general ways in which the central data smoothing and prediction
problems may be modified. These include the following:

i. The restriction of the function class $C_Y$, by constraints which
specify the response to certain frequencies (Section 4.1) or to certain more
general time functions (Section 4.3).

ii. The substitution of simplified spectra, which approximate true
spectra only at frequencies which are actually utilized (Section 4.1.2).

iii. The addition of signal or noise functions which involve a finite
number of random variables (Sections 4.1.3, 4.1.4).

iv. The estimation or prediction of a functional of the true signal,
rather than the signal itself (Sections 4.1.3, 4.1.4).

v. The substitution of random variables other than signal and noise,
and the treatment of "biased" statistics (Section 4.2).

vi. The use of more than two simultaneous observations (Section 4.4).

vii. The application of the mathematics of data smoothing to
nonstatistical problems (Section 4.5).

Other modifications are possible, which have not been illustrated. For
example, correlations between signal and noise may be handled very
simply, by modifying the physical model illustrated in Fig. 8. It is only
necessary to change the associated frequency functions, so as to generate
the correct pertinent covariances listed in Section 3.2.4. The methods of
Sections 3.3 and 3.4 may be combined, to find two optimum operators
$Y_{G1}$, $Y_{G2}$ restricted to finite memories. The single operator problem may
be solved for signal and noise situations which are different in different
segments of past time. In a special case, $s(\tau)$ is observed when
$-\infty < \tau < t - T$, and $s(\tau) + n(\tau)$ is observed when $t - T < \tau < t$. Added
complications, however, are likely to increase very drastically the number
of simultaneous linear equations which must be solved to find $Y_M(p)$.
A general solution of the following problem would be of interest to
engineers, particularly in connection with preliminary system studies:
Suppose the signal or noise spectrum is not known in detail but is known
to lie within some sort of limits. What $Y_G$ will give the best protection
against a large $\sigma^2$? If the most unfavorable permitted spectrum is
associated with each $Y_G$, what $Y_G$ will make the corresponding $\sigma^2$ a
minimum, and how large will the minimum be? It appears that no general
solution of this problem has been achieved.